RECOMBINANT DISULFIDE-STABILIZED POLYPEPTIDE
FRAGMENTS HAVING BINDING SPECIFICITY

BACKGROUND OF THE INVENTION 
Field of the Invention 

The present invention relates to disulfide- 
stabilized (ds) recombinant polypeptide molecules, such as the 
variable region of an antibody molecule, which have the 
binding ability and specificity for another peptide. Methods 
of producing these molecules and nucleic acid sequences 
encoding these molecules are also described. 
Background

Antibodies are molecules that recognize and bind to 
a specific cognate antigen. Numerous applications of 
hybridoma-produced monoclonal antibodies for use in clinical
diagnosis, treatment, and basic scientific research have been 
described. Clinical treatments of cancer, viral and microbial 
infections, B cell immunodeficiencies, and other diseases and 
disorders of the immune system using monoclonal antibodies 
appear promising. Fv fragments of immunoglobulins are 
considered the smallest functional component of antibodies 
required for high affinity binding of antigen. Their small 
size makes them potentially more useful than whole antibodies 
for clinical applications like imaging tumors and directing 
recombinant immunotoxins to tumors since size strongly 
influences tumor and tissue penetration. 

Fv fragments are heterodimers of the variable heavy
chain domain (V H ) and the variable light chain domain (V L ).
The heterodimers of heavy and light chain domains that occur
in whole IgG, for example, are connected by a disulfide bond.
Fv fragments lack this bond, and Fvs alone are therefore unstable.
Glockshuber et al., Biochemistry 29:1362-1367 (1990).
Recombinant Fvs which have V H and V L connected by a peptide
linker are typically stable; see, for example, Huston et al.,
Proc. Natl. Acad. Sci. USA 85:5879-5883 (1988) and Bird et
al., Science 242:423-426 (1988). These are single chain Fvs
which have been found to retain specificity and affinity and
have been shown to be useful for imaging tumors and for making
recombinant immunotoxins, for example for tumor therapy.
However, researchers have found that some of the single chain
Fvs have a reduced affinity for antigen and that the peptide
linker can interfere with binding.

Another approach to stabilize the Fvs was attempted 
by Glockshuber et al . , supra. Disulfide bonds were placed in 
the complementarity determining regions (CDR) of an antibody 
whose structure was known in a manner that had limited or no 
effect on ligand binding. This approach is problematic for 
stabilizing other Fvs with unknown structures because the 
structure of each CDR region changes from one antibody to the 
next and because disulfide bonds that bridge CDRs will likely 
interfere with antigen binding. Thus, it would be desirable 
to have alternative means to stabilize the Fv portions of an 
antibody of interest which would allow the affinity for the 
target antigen to be maintained. 

SUMMARY OF THE INVENTION 
The invention relates to a polypeptide specifically 
binding a ligand, wherein the polypeptide comprises a first 
variable region of a ligand binding moiety bound through a 
disulfide bond to a second separate variable region of the 
ligand binding moiety, the bond connecting framework regions 
of the first and second variable regions. The polypeptide may 
be conjugated to a radioisotope, an enzyme, a toxin, or a drug 
or may be recombinantly fused to a toxin, enzyme or a drug, 
for example. Nucleic acid sequences coding the polypeptides 
and pharmaceutical compositions containing them are also 
disclosed. 

The polypeptide is preferably one wherein the first
variable region is a light chain variable region of an
antibody and the second variable region is a heavy chain
variable region of the antibody. The polypeptide may also be
one wherein the first variable region is an α variable chain
region of a T cell receptor and the second variable region is
a β variable chain region of the T cell receptor.

Methods for producing a disulfide-stabilized
polypeptide of a ligand binding moiety having two variable
regions are also disclosed, comprising the following steps:

(a) mutating a nucleic acid for the first variable 
region so that cysteine is encoded at position 42, 43, 44, 45 
or 46, and mutating a nucleic acid sequence for the second 
variable region so that cysteine is encoded at position 103, 
104, 105, or 106, such positions being determined in 
accordance with the numbering scheme published by Kabat and 
Wu, corresponding to a light chain and a heavy chain region, 
respectively, of an antibody; or 

(b) mutating a nucleic acid for the first variable 
region so that cysteine is encoded at position 43, 44, 45, 46 
or 47 and mutating a nucleic acid for the second variable 
region so that cysteine is encoded at position 98, 99, 100, or 
101, such positions being determined in accordance with the
numbering scheme published by Kabat and Wu, corresponding to a
heavy chain and a light chain region, respectively, of an
antibody; then

(c) expressing the nucleic acid for the first 
variable region and the nucleic acid for the second variable 
region in an expression system; and 

(d) recovering the polypeptide having a binding 
affinity for the antigen. 

The invention provides an alternative means to 
recombinant Fvs which have V H and V L connected by a peptide 
linker. Though such recombinant single chain Fvs are 
typically stable and specific, some have a reduced affinity 
for antigen and the peptide linker can interfere with binding. 
A means to produce recombinant Fv polypeptides that are 
stabilized by a disulfide bond located in the conserved 
regions of the Fv fragment and compositions that include 
these, such as immunotoxins , are also described. 

The clinical administration of the small 
polypeptides of the invention affords a number of advantages 
over the use of larger fragments or entire antibody molecules. 
The polypeptides of this invention in preferred forms have 
greater stability due to the additional disulfide bond. Due
to their small size they also offer fewer cleavage sites to
circulating proteolytic enzymes, resulting in greater
stability. They reach their target tissue more rapidly, and
are cleared more quickly from the body. They also have
reduced immunogenicity. In addition, their small size
facilitates specific coupling to other molecules in drug 
targeting and imaging applications. 

The invention also provides a means of stabilizing
the antigen-binding portion (the V domain) of the T cell
receptors, by connecting the α and β chains of the V domain by
an inter-chain disulfide bond. Such stabilization of the V
domain will help isolate and purify this fragment in soluble
form. The molecule can then be used in applications similar
to those of other Fvs. They can be used in diagnostic assays
for tumor cells or for detection of immune-based diseases such
as autoimmune diseases and AIDS. They may also have
therapeutic use as a target for tumor cells or as a means to
block undesirable immune responses in autoimmune diseases, or
other immune-based disease.

BRIEF DESCRIPTION OF THE FIGURES 
Figure 1: Sequence comparison of the heavy and
light chain variable regions of MAb B3 (first row) and MAb
McPC603 (second row). The solid line and the dot(s) between
two sequences indicate identity and similarity, respectively.
A space was inserted between the framework (FR) and the
complementarity determining (CDR) regions, which are indicated
below the sequence. The residues that can be changed to Cys
for preferred interchain disulfide bonds are boxed. The
crosses at the top of the sequence indicate every 10th
residue. The V H 95 Ser to Tyr mutation site and its
pseudo-symmetry related Tyr residue in the light chain are indicated
by a check mark (/) on top. In the sequence listing, heavy
chain of MAb B3 is Seq. ID No. 1, heavy chain of MAb McPC603
is Seq. ID No. 2, light chain of MAb B3 is Seq. ID No. 3, and
light chain of MAb McPC603 is Seq. ID No. 4. The assignment
of framework (FR1-4) and complementarity determining regions
(CDR1-3) is according to Kabat et al., infra.

Figure 2: Plasmids for expression of B3(dsFv)-
immunotoxins. Single stranded uracil containing DNA of pULI28
was the template to mutate arg44 of B3(V H ) and ser105 of
B3(V L ) to cys by Kunkel mutagenesis. The expression plasmid
pYR38-2 for B3(V H cys44) was generated by deletion of a V L -
PE38KDEL encoding EcoRI fragment. pULI39 encoding
B3(V L cys105)-PE38KDEL was constructed by subcloning a V L -
cys105 containing PstI-HindIII fragment into pULI21 that
encodes B3(V L )-PE38KDEL.

Figure 3: Specific cytotoxicity of B3(dsFv)-
PE38KDEL and B3(Fv)-PE38KDEL towards different carcinoma cell
lines. (a) Comparison of cytotoxicity of B3(Fv)-PE38KDEL and
B3(dsFv)-PE38KDEL towards B3-antigen expressing A431 cells and
B3-negative HUT-1002 cells; (b) Cytotoxicity of B3(dsFv)-
PE38KDEL towards various cell lines; (c) Competition of
cytotoxicity towards A431 cells by addition of excess MAb B3.
Note that addition of equal amounts of isotype-matched
control, MAb HB21, which binds to A431 cells but to a
different antigen (transferrin receptor) does not compete.

Figure 4: Amino acid sequence comparison of the
heavy and light chain framework regions (FR2 and FR4,
respectively) of MAb (monoclonal antibody) McPC603 ("603"),
MAb B3 ("B3"), MAb e23 ("e23") and MAb aTac ("aTac").

Figure 5: Plasmid construction for expression of
e23(dsFv)-PE38KDEL.

DETAILED DESCRIPTION 
This invention discloses stable polypeptides which
are capable of specifically binding ligands and which have two
variable regions (such as light and heavy chain variable
regions) bound together through a disulfide bond occurring in
the framework regions of each variable region. These
polypeptides are highly stable and have high binding affinity.
They are produced by mutating nucleic acid sequences for each
region so that cysteine is encoded at specific points in the
framework regions of the polypeptide.

General Immunoglobulin Structure

Members of the immunoglobulin family all share an
immunoglobulin-like domain characterized by a centrally placed
disulfide bridge that stabilizes a series of antiparallel β
strands into an immunoglobulin-like fold. Members of the
family (e.g., MHC class I, class II molecules, antibodies and
T cell receptors) can share homology with either
immunoglobulin variable or constant domains. An antibody
heavy or light chain has an N-terminal (NH 2 ) variable region
(V), and a C-terminal (-COOH) constant region (C). The heavy
chain variable region is referred to as V H , and the light
chain variable region is referred to as V L . V H and V L
fragments together are referred to as "Fv". The variable
region is the part of the molecule that binds to the
antibody's cognate antigen, while the constant region
determines the antibody's effector function (e.g., complement
fixation, opsonization). Full-length immunoglobulin or
antibody "light chains" (generally about 25 kilodaltons (Kd),
about 214 amino acids) are encoded by a variable region gene
at the N-terminus (generally about 110 amino acids) and a
constant region gene at the COOH-terminus. Full-length
immunoglobulin or antibody "heavy chains" (generally about 50
Kd, about 446 amino acids) are similarly encoded by a
variable region gene (generally encoding about 116 amino
acids) and one of the constant region genes (encoding about
330 amino acids). Typically, the "V L " will include the
portion of the light chain encoded by the V L and J L (J or
joining region) gene segments, and the "V H " will include the
portion of the heavy chain encoded by the V H , and D H (D or
diversity region) and J H gene segments. See generally, Roitt,
et al., Immunology, Chapter 6, (2d ed. 1989) and Paul,
Fundamental Immunology, Raven Press (2d ed. 1989), both
incorporated by reference herein.

An immunoglobulin light or heavy chain variable
region comprises three hypervariable regions, also called
complementarity determining regions or CDRs, flanked by four
relatively conserved framework regions or FRs. Numerous
framework regions and CDRs have been described (see,
"Sequences of Proteins of Immunological Interest," E. Kabat,
et al., U.S. Government Printing Office, NIH Publication No.
91-3242 (1991); which is incorporated herein by reference
("Kabat and Wu")). The sequences of the framework regions of
different light or heavy chains are relatively conserved. The
CDR and FR polypeptide segments are designated empirically
based on sequence analysis of the Fv region of preexisting
antibodies or of the DNA encoding them. From alignment of
antibody sequences of interest with those published in Kabat
and Wu and elsewhere, framework regions and CDRs can be
determined for the antibody or other ligand binding moiety of
interest. The combined framework regions of the constituent
light and heavy chains serve to position and align the CDRs.
The CDRs are primarily responsible for binding to an epitope
of an antigen and are typically referred to as CDR1, CDR2, and
CDR3, numbered sequentially starting from the N-terminus of
the variable region chain. Framework regions are similarly
numbered.

The general arrangement of T cell receptor genes is
similar to that of antibody heavy chains. T cell receptors
(TCR) have both variable (V) domains and constant (C) domains.
The V domains function to bind antigen. There are regions in
the V domain homologous to the framework and CDR regions of
antibodies. Homology to the immunoglobulin V regions can be
determined by alignment. The V region of the TCRs has a high
amino acid sequence homology with the Fv of antibodies.
Hedrick et al., Nature (London) 308:153-158 (1984),
incorporated by reference herein.

The term CDR, as used herein, refers to amino acid
sequences which together define the binding affinity and
specificity of the natural variable binding region of a native
immunoglobulin binding site (such as Fv), a T cell receptor
(such as V α and V β), or a synthetic polypeptide which mimics
this function. The term "framework region" or "FR", as used
herein, refers to amino acid sequences interposed between
CDRs.

The "ligand binding moieties" referred to here are
those molecules that have a variable domain that is capable of
functioning to bind specifically or otherwise recognize a
particular ligand or antigen. Moieties of particular interest
include antibodies and T cell receptors, as well as synthetic
or recombinant binding fragments of those such as Fv, Fab,
F(ab') 2 and the like. Appropriate variable regions include
V H , V L , V α and V β and the like.

Practice of this invention preferably employs the Fv 
portions of an antibody or the V portions of a TCR only. 
Other sections, e.g., C H and C L , of native immunoglobulin 
protein structure need not be present and normally are 
intentionally omitted from the polypeptides of this invention. 
However, the polypeptides of the invention may comprise 
additional polypeptide regions defining a bioactive region, 
e.g., a toxin or enzyme, or a site onto which a toxin or a 
remotely detectable substance can be attached, as will be 
described below. 



Preparation of Fv Fragments

Information regarding the Fv antibody fragment or
other ligand binding moiety of interest is required in order
to place the disulfide bond properly and thereby stabilize
the desired fragment, such as a disulfide-stabilized Fv
fragment (dsFv). The amino acid sequences of the variable
fragments that are of interest are compared by alignment with
those analogous sequences in the well-known publication by
Kabat and Wu, supra, to determine which sequences can be
mutated so that cysteine is encoded in the proper position
of each heavy and light chain variable region to provide a
disulfide bond in the framework regions of the desired
polypeptide fragment. Cysteine residues are necessary to
provide the covalent disulfide bonds. For example, a
disulfide bond could be placed to connect FR4 of V L and FR2 of
V H ; or to connect FR2 of V L and FR4 of V H .

After the sequences are aligned, the amino acid
positions in the sequence of interest that align with the
following positions in the numbering system used by Kabat and
Wu are identified: positions 43, 44, 45, 46, and 47 (group 1)
and positions 103, 104, 105, and 106 (group 2) of the heavy
chain variable region; and positions 42, 43, 44, 45, and 46
(group 3) and positions 98, 99, 100, and 101 (group 4) of the
light chain variable region. In some cases, some of these
positions may be missing, representing a gap in the alignment.

Then, the nucleic acid sequences encoding the amino 
acids at two of these identified positions are changed such 
that these two amino acids are mutated to cysteine residues. 
The pairs of amino acids to be selected are, in order of
decreasing preference: 

V H 44-V L 100, 

V H 105-V L 43, 

V H 105-V L 42, 

V H 44-V L 101, 

V H 106-V L 43, 

V H 104-V L 43, 

V H 44-V L 99, 

V H 45-V L 98, 

V H 46-V L 98, 

V H 103-V L 43, 

V H 103-V L 44,

V H 103-V L 45. 

Most preferably, substitutions of cysteine are made at the
positions:

V H 44-V L 100; or 
V H 105-V L 43. 

(The notation V H 44-V L 100, for example, refers to a 
polypeptide with a V H having a cysteine at position 44 and a 
cysteine in V L at position 100; the positions being in 
accordance with the numbering given by Kabat and Wu.) 
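
The ranked list above can be read as a simple lookup procedure: take the first pair for which both positions are present in the aligned sequence of interest, skipping pairs that fall in an alignment gap. The following Python fragment is a minimal sketch of that reading, not a method stated in this description; the names PREFERRED_PAIRS and best_pair are hypothetical, and representing the aligned positions as sets of Kabat and Wu numbers is an assumption made for illustration.

```python
# Candidate cysteine pairs in Kabat and Wu numbering, most preferred first,
# copied from the list above. Each entry is (V_H position, V_L position).
PREFERRED_PAIRS = [
    (44, 100), (105, 43), (105, 42), (44, 101), (106, 43), (104, 43),
    (44, 99), (45, 98), (46, 98), (103, 43), (103, 44), (103, 45),
]


def best_pair(vh_positions, vl_positions):
    """Return the most preferred (V_H, V_L) pair whose positions both exist.

    vh_positions and vl_positions are sets of Kabat and Wu positions that are
    present in the aligned sequence of interest (a position is absent when the
    alignment shows a gap there). Returns None if no listed pair is available.
    """
    for vh, vl in PREFERRED_PAIRS:
        if vh in vh_positions and vl in vl_positions:
            return (vh, vl)
    return None


if __name__ == "__main__":
    # Hypothetical alignment in which V_L position 100 falls in a gap.
    vh = set(range(1, 114))
    vl = set(range(1, 108)) - {100}
    print(best_pair(vh, vl))
```

Structural checks, such as the inter-residue distance and the distance from the CDRs discussed further on, would still need to be applied to whichever pair is selected.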

Note that with the assignment of positions according
to Kabat and Wu, the numbering of positions refers to defined
conserved residues and not to actual amino acid positions in a
given antibody. For example, Cys L100 (cf. Kabat and Wu), which
is used to generate ds(Fv)B3 as described in the example
below, actually corresponds to position 105 of B3(V L ).

In the case of V α and V β of T cell receptors,
reference can also be made to the numbering scheme in Kabat
and Wu for T cell receptors. Substitutions of cysteines can
be made at position 41, 42, 43, 44 or 45 of V α and at position
106, 107, 108, 109 or 110 of V β; or at position 104, 105, 106,
107, 108 or 109 of V α and at position 41, 42, 43, 44 or 45 of
V β, such positions being in accordance with the Kabat and Wu
numbering scheme for TCRs. When such reference is made, the
most preferred cysteine substitutions are V α 42-V β 110 and
V α 108-V β 42. V β positions 106, 107 and V α positions 104, 105
are CDR positions, but they are positions in which disulfide
bonds can be stably located.

As an alternative to identifying the amino acid 
position for cysteine substitution with reference to the Kabat 
and Wu numbering scheme, one could align a sequence of 
interest with the sequence for monoclonal antibody (MAb) B3 
(see below) set out in Figure 1. The amino acid positions of 
B3 which correlate with the Kabat and Wu V H positions set 
forth above for Group 1 are 43, 44, 45, 46, and 47, 
respectively; for Group 2 are 109, 110, 111, and 112, 
respectively. The amino acid positions of B3 which correlate 
with the Kabat and Wu V L positions set forth above for Group 3 
are 47, 48, 49, 50 and 51, respectively; Group 4 are 103, 104, 
105, and 106, respectively. 
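
The correspondence just described between Kabat and Wu positions and actual B3 sequence positions can be written out as a small lookup table. The Python sketch below simply transcribes the four groups from the text; the names KABAT_TO_B3 and kabat_to_b3 are hypothetical and are introduced only for illustration.

```python
# Kabat and Wu position -> position in the B3 sequence, transcribed from the
# four groups listed in the text above.
KABAT_TO_B3 = {
    "VH": {
        43: 43, 44: 44, 45: 45, 46: 46, 47: 47,      # group 1
        103: 109, 104: 110, 105: 111, 106: 112,      # group 2
    },
    "VL": {
        42: 47, 43: 48, 44: 49, 45: 50, 46: 51,      # group 3
        98: 103, 99: 104, 100: 105, 101: 106,        # group 4
    },
}


def kabat_to_b3(chain, kabat_position):
    """Look up the B3 sequence position for a Kabat and Wu position.

    chain is "VH" or "VL". A KeyError is raised for positions outside the
    four groups, since the text gives no mapping for them.
    """
    return KABAT_TO_B3[chain][kabat_position]


if __name__ == "__main__":
    # The most preferred pair, V H 44-V L 100, expressed in B3 numbering.
    print(kabat_to_b3("VH", 44), kabat_to_b3("VL", 100))
```

A table of this kind applies only to B3; for another antibody the offsets would have to be read from its own alignment.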

Alternatively, the sites of mutation to the cysteine 
residues can be identified by review of either the actual 
antibody or the model antibody of interest as exemplified 
below. Computer programs to create models of proteins such as 
antibodies are generally available and well-known to those 
skilled in the art (see Kabat and Wu; Loew, et al., Int. J.
Quant. Chem., Quant. Biol. Symp., 15:55-66 (1988); Bruccoleri,
et al., Nature, 335:564-568 (1988); Chothia, et al., Science,
233:755-758 (1986), all of which are incorporated herein by
reference). Commercially available computer programs can be
used to display these models on a computer monitor, to
calculate the distance between atoms, and to estimate the
likelihood of different amino acids interacting (see, Ferrin,
et al., J. Mol. Graphics, 6:13-27 (1988), incorporated by
reference herein). For example, computer models can predict
charged amino acid residues that are accessible and relevant
in binding and then conformationally restricted organic
molecules can be synthesized. See, for example, Saragovi, et
al., Science, 253:792 (1991), incorporated by reference
herein. In other cases, an experimentally determined actual
structure of the antibody may be available.

WO 94/29350 PCT/US94/06687

A pair of suitable amino acid residues should (1)
have a Cα-Cα distance between the two residues less than or
equal to 8 Å, preferably less than or equal to 6.5 Å
(determined from the crystal structure of antibodies which are
available such as those from the Brookhaven Protein Data Bank)
and (2) be as far away from the CDR region as possible. Once
they are identified, they can be substituted with cysteines.
The Cα-Cα distances between residue pairs in the modeled B3 at
positions homologous to those listed above are set out in
Table 1, below.
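
Criterion (1) is a plain Euclidean distance test on alpha-carbon coordinates taken from a crystal structure or model. The Python sketch below illustrates only that test, under the assumption that the coordinates are supplied as (x, y, z) tuples in ångströms; the function names and the labels it returns are hypothetical, and criterion (2), distance from the CDRs, is not evaluated here.

```python
import math


def ca_distance(a, b):
    """Euclidean distance between two alpha-carbon coordinates (x, y, z)."""
    return math.dist(a, b)


def classify_pair(a, b, limit=8.0, preferred=6.5):
    """Classify a residue pair by criterion (1) of the text.

    Returns "preferred" at or below 6.5 angstroms, "acceptable" at or below
    8 angstroms, and "too far" otherwise. The labels are illustrative only.
    """
    d = ca_distance(a, b)
    if d <= preferred:
        return "preferred"
    if d <= limit:
        return "acceptable"
    return "too far"


if __name__ == "__main__":
    # Made-up coordinates, not taken from any real structure.
    print(classify_pair((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
```

In practice the coordinates would be read from a structure file for the antibody or its model, and each candidate pair from the preference list would be checked in turn.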

Introduction of one pair of cysteine substitutions
will be sufficient for most applications. Additional 
substitutions may be useful and desirable in some cases. 

Modifications of the genes to encode cysteine at the 
target point may be readily accomplished by well-known 
techniques, such as site-directed mutagenesis (see, Gillman
and Smith, Gene, 8:81-97 (1979) and Roberts, S., et al.,
Nature, 328:731-734 (1987), both of which are incorporated
herein by reference), by the method described in Kunkel, Proc. 
Natl. Acad. Sci. USA 82:488-492 (1985), incorporated by 
reference herein, or by any other means known in the art. 

Separate vectors with sequences for the desired V H 
and V L sequences (or other homologous V sequences) may be made 
from the mutagenized plasmids. The sequences encoding the 
heavy chain regions and the light chain regions are produced 
and expressed in separate cultures in any manner known or 
described in the art, with the exception of the guidelines 
provided below. If another sequence, such as a sequence for a 
toxin, is to be incorporated into the expressed polypeptide, 
it can be linked to the V H or the V L sequence at either the N-
or C-terminus or be inserted into other protein sequences in a
suitable position. For example, for Pseudomonas exotoxin (PE)
derived fusion proteins, either V H or V L should be linked to
the N-terminus of the toxin or be inserted into domain III of
PE, as was done for TGFα in Theuer et al., J. Urology 149
(1993), incorporated by reference herein. For Diphtheria
toxin-derived immunotoxins, V H or V L is preferably linked to
the C-terminus of the toxin.

Peptide linkers, such as those used in the
expression of recombinant single chain antibodies, may be
employed to link the two variable regions (V H and V L , V α and
V β) if desired and may positively increase stability in some
molecules. Bivalent or multivalent disulfide stabilized
polypeptides of the invention can be constructed by connecting
two or more, preferably identical, V H regions with a peptide
linker and adding V L as described in the examples, below.
Connecting two or more V H regions by linkers is preferred to
connecting V L regions by linkers since the tendency to form
homodimers is greater with V L regions. Peptide linkers and
their use are well-known in the art. See, e.g., Huston et
al., Proc. Natl. Acad. Sci. USA, supra; Bird et al., Science,
supra; Glockshuber et al., supra; U.S. Patent No. 4,946,778,
U.S. Patent No. 5,132,405 and most recently in Stemmer et al.,
Biotechniques 14:256-265 (1993), all incorporated herein by
reference.

Proteins of the invention can be expressed in a
variety of host cells, including E. coli, other bacterial
hosts, yeast, and various higher eucaryotic cells such as the
COS, CHO and HeLa cell lines and myeloma cell lines. The
recombinant protein gene will be operably linked to
appropriate expression control sequences for each host. For
E. coli this includes a promoter such as the T7, trp, tac, lac
or lambda promoters, a ribosome binding site, and preferably a
transcription termination signal. For eucaryotic cells, the
control sequences will include a promoter and preferably an
enhancer derived from immunoglobulin genes, SV40,
cytomegalovirus, etc., and a polyadenylation sequence, and may
include splice donor and acceptor sequences. The plasmids of
the invention can be transferred into the chosen host cell by
well-known methods such as calcium chloride transformation for
E. coli and calcium phosphate treatment or electroporation for
mammalian cells. Cells transformed by the plasmids can be
selected by resistance to antibiotics conferred by genes
contained on the plasmids, such as the amp, gpt, neo and hyg
genes.

Methods for expressing single chain antibodies
in bacteria such as E. coli, and/or refolding them to an
appropriate folded form, have
been described, are well-known, and are applicable to the
polypeptides of this invention. See, Buchner et al.,
Analytical Biochemistry 205:263-270 (1992); Pluckthun,
Biotechnology, 9:545 (1991); Huse, et al., Science, 246:1275
(1989) and Ward, et al., Nature, 341:544 (1989), all
incorporated by reference herein.

Often, functional protein from E. coli or other
bacteria is generated from inclusion bodies and requires the
solubilization of the protein using strong denaturants, and
subsequent refolding. In the solubilization step, a reducing
agent must be present to reduce disulfide bonds, as is
well-known in the art. An exemplary buffer with a reducing agent
is: 0.1 M Tris, pH 8, 6 M guanidine, 2 mM EDTA, 0.3 M DTE
(dithioerythritol). Reoxidation of protein disulfide bonds
can be effectively catalyzed in the presence of low molecular
weight thiol reagents in reduced and oxidized form, as
described in Saxena et al., Biochemistry 9:5015-5021 (1970),
incorporated by reference herein, and especially described by
Buchner, et al., Anal. Biochem., supra (1992).

Renaturation is typically accomplished by dilution
(e.g., 100-fold) of the denatured and reduced protein into
refolding buffer. An exemplary buffer is 0.1 M Tris, pH 8.0,
0.5 M L-arginine, 8 mM oxidized glutathione (GSSG), and 2 mM
EDTA.
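
As a rough arithmetic check on the 100-fold dilution, the Python sketch below computes how much of the solubilization buffer components would be carried into the refolding mixture. It assumes, purely for illustration, that the protein is diluted directly out of the exemplary solubilization buffer (0.3 M DTE, 6 M guanidine) with no intervening buffer exchange; the function name diluted_mM is hypothetical.

```python
def diluted_mM(stock_molar, fold):
    """Concentration in mM after diluting a stock given in M by `fold`."""
    if fold <= 0:
        raise ValueError("dilution factor must be positive")
    return stock_molar * 1000.0 / fold


# Components of the exemplary solubilization buffer, diluted 100-fold.
dte_carryover_mM = diluted_mM(0.3, 100)        # dithioerythritol
guanidine_carryover_mM = diluted_mM(6.0, 100)  # guanidine

# Oxidized glutathione in the exemplary refolding buffer, in mM.
GSSG_MM = 8.0

if __name__ == "__main__":
    print(f"DTE carried over: {dte_carryover_mM:.1f} mM")
    print(f"guanidine carried over: {guanidine_carryover_mM:.1f} mM")
    print(f"GSSG in refolding buffer: {GSSG_MM:.1f} mM")
```

Under this assumption the carried-over DTE comes to about 3 mM, which is below the 8 mM GSSG of the exemplary refolding buffer, so the mixture contains both reduced and oxidized thiol species.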

As a necessary modification to the single chain 
antibody protocol, the heavy and light chain regions were 
separately solubilized and reduced and then combined in the 
refolding solution. A preferred yield is obtained when these 
two proteins are mixed in a molar ratio such that the molar
excess of one protein over the other does not exceed 5-fold.
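
The mixing guideline above amounts to a bound on the ratio of the two molar amounts. A minimal Python sketch of that check follows; it assumes that a "5 fold excess" means the larger molar amount divided by the smaller is at most 5, and the function names are hypothetical.

```python
def molar_excess(moles_vh, moles_vl):
    """Fold excess of the more abundant chain over the less abundant one."""
    if moles_vh <= 0 or moles_vl <= 0:
        raise ValueError("both molar amounts must be positive")
    return max(moles_vh, moles_vl) / min(moles_vh, moles_vl)


def within_preferred_ratio(moles_vh, moles_vl, max_fold=5.0):
    """True if neither chain exceeds the other by more than max_fold."""
    return molar_excess(moles_vh, moles_vl) <= max_fold


if __name__ == "__main__":
    # Hypothetical amounts in nanomoles of the two separately reduced chains.
    print(within_preferred_ratio(10.0, 40.0))
    print(within_preferred_ratio(10.0, 60.0))
```

Because the chains differ in molecular weight when one of them carries a fused toxin, amounts measured by mass would first need to be converted to moles before applying this check.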

It is desirable to add excess oxidized glutathione 
or other oxidizing low molecular weight compounds to the 
refolding solution after the redox-shuffling is completed.

Purification of polypeptides

Once expressed, the recombinant proteins can be 
purified according to standard procedures of the art, 
including ammonium sulfate precipitation, affinity columns, 
column chromatography, and the like (see, generally, R. 
Scopes, Protein Purification, Springer-Verlag, N.Y. (1982)).
Substantially pure compositions of at least about 90 to 95%
homogeneity are preferred, and 98 to 99% or more homogeneity
is most preferred for pharmaceutical uses. Once purified,
partially or to homogeneity as desired, the polypeptides 
should be substantially free of endotoxin for pharmaceutical 
purposes and may then be used therapeutically. 

Various dsFv fragment molecules 

It should be understood that the description of the
dsFv peptides described above can cover all classes/groups of
antibodies of all different species (e.g., mouse, rabbit,
goat, human), chimeric peptides, humanized antibodies and the
like. "Chimeric antibodies" or "chimeric peptides" refer to
those antibodies or antibody peptides wherein one portion of
the peptide has an amino acid sequence that is derived from,
or is homologous to, a corresponding sequence in an antibody
or peptide derived from a first gene source, while the
remaining segment of the chain(s) is homologous to
corresponding sequences of another gene source. For example,
chimeric antibodies can include antibodies where the framework
and complementarity determining regions are from different
sources. For example, non-human CDRs are integrated into
human framework regions linked to a human constant region to
make "humanized antibodies." See, for example, PCT
Application Publication No. WO 87/02671, U.S. Patent No.
4,816,567, EP Patent Application 0173494, Jones, et al.,
Nature, 321:522-525 (1986) and Verhoeyen, et al., Science,
239:1534-1536 (1988), all of which are incorporated by
reference herein. Similarly, the source of V H can differ from
the source of V L .

The subject polypeptides can be used to make fusion
proteins such as immunotoxins. Immunotoxins are characterized
by two functional components and are particularly useful for
killing selected cells in vitro or in vivo. One functional
component is a cytotoxic agent which is usually fatal to a
cell when attached or absorbed to the cell. The second
functional component, known as the "delivery vehicle,"
provides a means for delivering the toxic agent to a
particular cell type, such as cells comprising a carcinoma.
The two components can be recombinantly fused together via a
peptide linker such as described in Pastan et al., Ann. Rev.
Biochem. (1992), infra. The two components can also be
chemically bonded together by any of a variety of well-known
chemical procedures. For example, when the cytotoxic agent is
a protein and the second component is an intact
immunoglobulin, the linkage may be by way of
heterobifunctional cross-linkers, e.g., SPDP, carbodiimide, or
the like. Production of various immunotoxins is well-known
within the art, and can be found, for example in "Monoclonal
Antibody-Toxin Conjugates: Aiming the Magic Bullet," Thorpe et
al., Monoclonal Antibodies in Clinical Medicine, Academic
Press, pp. 168-190 (1982) and Waldmann, Science, 252:1657
(1991), both of which are incorporated herein by reference.

A variety of cytotoxic agents are suitable for use
in immunotoxins. Cytotoxic agents can include radionuclides,
such as Iodine-131, Yttrium-90, Rhenium-188, and Bismuth-212;
a number of chemotherapeutic drugs, such as vindesine,
methotrexate, adriamycin, and cisplatin; and cytotoxic
proteins such as ribosomal inhibiting proteins like pokeweed
antiviral protein, Pseudomonas exotoxin A, ricin, diphtheria
toxin, ricin A chain, gelonin, etc., or an agent active at the
cell surface, such as the phospholipase enzymes (e.g.,
phospholipase C). (See, generally, Pastan et al.,
"Recombinant Toxins as Novel Therapeutic Agents," Ann. Rev.
Biochem. 61:331-354 (1992); "Chimeric Toxins," Olsnes and
Pihl, Pharmac. Ther., 25:355-381 (1982), and "Monoclonal
Antibodies for Cancer Detection and Therapy," eds. Baldwin and
Byers, pp. 159-179, 224-266, Academic Press (1985), which are
incorporated herein by reference.)

The polypeptides can be conjugated or recombinantly 
fused to a variety of pharmaceutical agents in addition to 
those described above, such as drugs, enzymes, hormones, 
chelating agents capable of binding an isotope, catalytic 
antibodies and other proteins useful for diagnosis or 
treatment of disease. 

For diagnostic purposes, the polypeptides can either 
be labeled or unlabeled. A wide variety of labels may be 
employed, such as radionuclides, fluors, enzymes, enzyme 
substrates, enzyme cofactors, enzyme inhibitors, ligands 
(particularly haptens), and the like. Numerous types of 
immunoassays are available and are well known to those skilled 
in the art. 

Molecules homologous to antibody Fv domains - T-cell receptors 

This invention can apply to molecules that exhibit a 
high degree of homology to the antibody Fv domains, including 
the ligand-specific V-region of the T-cell receptor (TCR). An 
example of such an application is outlined below. The 
sequence of the antigen-specific V region of a TCR molecule, 
2B4 (Becker et al., Nature (London) 317:430-434 (1985)), was 
aligned against the Fv domains of two antibody molecules, 
McPC603 (see below) and J539 (Protein Data Bank entry 2FBJ), 
using a standard sequence alignment package. When the V α 
sequence of 2B4 was aligned to the V H sequences of the two 
antibodies, the S1 site residue, corresponding to V H 44 of B3, 
can be identified as V α 43S (TCR 42 in the numbering scheme of 
Kabat and Wu) and the S2 site residue, corresponding to V H 111 
of B3, as V α 104Q (TCR 108 in the numbering scheme of Kabat and 
Wu). When the same V α sequence was aligned to the V L 
sequences of the two antibodies, the same residues, V α 43S and 
V α 104Q, can be identified, this time aligned to the residues 
corresponding to V L 48 and V L 105 of B3, respectively. 
Similarly, the 2B4 residues V β 42E and V β 107P (TCR 42 and 110 
in the numbering scheme of Kabat et al.) can be aligned to 
antibody residues corresponding to V H 44 and V H 111 of B3 and at 
the same time to V L 48 and V L 105 of B3. Therefore, the two 
most preferred interchain disulfide bond sites in this TCR are 
V α 43-V β 107 and V α 104-V β 42. Mutating the two residues in 
one of these pairs into cysteine will introduce a 
disulfide bond between the α and β chains of this molecule. 
The stabilization that results from this disulfide bond will 
make it possible to isolate and purify these molecules in 
large quantities. 

Binding Affinity of dsFv polypeptides 

The polypeptides of this invention are capable of 
specifically binding a ligand. For this invention, a 
polypeptide specifically binding a ligand generally refers to 
a molecule capable of reacting with or otherwise recognizing 
or binding an antigen or a receptor on a target cell. An 
antibody or other polypeptide has binding affinity for a 
ligand or is specific for a ligand if the antibody or peptide 
binds or is capable of binding the ligand as measured or 
determined by standard antibody-antigen or ligand-receptor 
assays, for example, competitive assays, saturation assays, or 
standard immunoassays such as ELISA or RIA. This definition 
of specificity applies to single heavy and/or light chains, 
CDRs, fusion proteins or fragments of heavy and/or light 
chains, which are specific for the ligand if they bind the 
ligand alone or in combination. 

In competition assays the ability of an antibody or 
peptide fragment to bind a ligand is determined by detecting 
the ability of the peptide to compete with the binding of a 
compound known to bind the ligand. Numerous types of 
competitive assays are known and are discussed herein. 
Alternatively, assays that measure binding of a test compound 
in the absence of an inhibitor may also be used. For 
instance, the ability of a molecule or other compound to bind 
the ligand can be detected by labelling the molecule of 
interest directly, or the molecule can be unlabelled and detected 
indirectly using various sandwich assay formats. Numerous 
types of binding assays such as competitive binding assays are 
known (see, e.g., U.S. Patent Nos. 3,376,110, 4,016,043, and 
Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring 
Harbor Publications, N.Y. (1988), which are incorporated 
herein by reference). Assays for measuring binding of a test 
compound to one component alone rather than using a 
competition assay are also available. For instance, 
immunoglobulin polypeptides can be used to identify the 
presence of the ligand. Standard procedures for monoclonal 
antibody assays, such as ELISA, may be used (see Harlow and 
Lane, supra). For a review of various signal producing 
systems which may be used, see U.S. Patent No. 4,391,904, 
which is incorporated herein by reference. 

The following examples are offered for the purpose 
of illustration and are not to be construed as limitations on 
the invention. 

EXAMPLES 

The computer modeling and identification of residues 
in the conserved framework regions of V H and V L of the 
monoclonal antibody (MAb) B3 and MAb e23 that can be mutated 
to cysteines and form a disulfide-stabilized Fv without 
interfering with antigen binding are disclosed. B3 reacts 
with specific carbohydrates present on many human cancers 
(Pastan et al., Cancer Res. 51:3781-3787 (1991), incorporated 
by reference herein). MAb e23 reacts specifically with the 
erbB2 antigen present on many human carcinomas. Active 
immunotoxins containing such a disulfide-stabilized Fv are 
also described. 

I. Design of a disulfide connection between V H and V L 
of MAb B3 which does not affect the structure of the binding 
site. 

A. Design Approach 

Because the tertiary structure of MAb B3 is not 
known, we generated a model of B3(Fv) from the structure of 
MAb McPC603 (see below) by replacing or deleting appropriate 
amino acids. MAb McPC603 was selected because it has the 
highest overall (L+H) sequence identity and similarity to B3 among 
all published mouse antibody structures. A total of 44 
(including 2 deletions) and 40 (including 1 deletion) amino 
acids of the V H and V L domains, respectively, of McPC603 were 
changed. No insertion was necessary. This structure was then 
energy-minimized using CHARMM (see below) in stages: first 
only the hydrogen atoms were varied, then the deleted regions, 
then all the mutated residues, and finally the whole molecule. 

Three criteria were used to select possible 
positions for disulfide connections between V H and V L. (i) 
The disulfide should connect amino acids in structurally 
conserved framework regions of V H and V L, so that the 
disulfide stabilization works not only for B3(Fv) but also for 
other Fvs. (ii) The distance between V H and V L should be 
small enough to enable the formation of a disulfide without 
generating strain on the Fv structure. (iii) The disulfide 
should be at a sufficient distance from the CDRs to avoid 
interference with antigen binding. These criteria were met by 
the following two potential disulfide bridges, although there 
are other potential sites around the two sites as shown in 
Table 1. One possibility was to replace Arg44 of B3(V H) and 
Ser105 of B3(V L) with cysteines to generate a disulfide 
between those positions. The other was to change Gln111 of 
B3(V H) and Ser48 of B3(V L) to cysteines (see Figure 1). These 
two pairs are related to one another by the pseudo two-fold 
symmetry that approximately relates the V H and V L structures. 
In each case, one of the residues involved in the putative 
disulfide bond (V H 111, V L 105) is flanked on both sides by a 
highly conserved Gly residue which can help absorb local 
distortions to the structure caused by the introduction of the 
disulfide bond. We energy-minimized models for both 
possibilities as well as one in which both disulfide bonds are 
present. The V H 44-V L 105 connection was chosen for further 
study because the energy-refined model structure with this 
connection had a slightly better disulfide bond geometry than 
that with the V H 111-V L 48 connection. With some other 
antibodies this latter connection may be preferable over the 
former. 

B. Computer Modeling. 

The initial model of the B3(Fv) structure was 
obtained from the structure of the variable domain of McPC603 
(Satow et al., J. Mol. Biol. 190:593-604 (1986)), Brookhaven 
Protein Data Bank (Brookhaven National Laboratory, Upton, Long 
Island, New York) entry 1MCP (Abola et al., Crystallographic 
Databases-Information Content, Software Systems, Scientific 
Applications pp. 107-132 (1987)), by deletion and mutation of 
appropriate residues using an in-house molecular graphics 
program known as GEMM. The structure of this model and those 
of various mutants were refined by a series of adopted 
basis Newton-Raphson (ABNR) energy minimization procedures 
using the molecular dynamics simulation program CHARMM (as 
described in Brooks et al., J. Comp. Chem. 4:187-217 (1983), 
incorporated by reference herein), version 22. Details of this 
procedure are as follows: 

1. Energy Minimization 

All structural refinements were performed by the 
ABNR (adopted basis Newton-Raphson) energy minimization 
procedure using the molecular dynamics simulation program 
CHARMM (Brooks et al., supra), version 22. The all-H parameter 
set was used; the nonbond cutoff distance was 13.0 Å, with a 
switching function applied to the Lennard-Jones potential and 
a shifting function to the electrostatic interactions between 10 
and 12 Å. Solvent was not included. A dielectric constant 
of 1 was used for all refinements except for the last runs, 
for which a distance-dependent dielectric constant was used. 
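
As a rough illustration of the truncation scheme described above, 
the sketch below implements a switching function and a shifting 
function in the forms commonly associated with CHARMM (Brooks et 
al., supra), together with a Coulomb term that can use a 
distance-dependent dielectric. The function names, the mapping of 
the 10 and 12 Å values onto r_on and r_off, and the choice of a 
dielectric numerically equal to r are assumptions made for 
illustration; the exact program options used in the calculations 
are not stated here. 

```python
# Illustrative only: names and parameter mapping are assumptions.

def switch(r, r_on=10.0, r_off=12.0):
    """Switching function: 1 below r_on, smoothly 0 at r_off (distances in Å)."""
    if r <= r_on:
        return 1.0
    if r >= r_off:
        return 0.0
    num = (r_off**2 - r**2) ** 2 * (r_off**2 + 2.0 * r**2 - 3.0 * r_on**2)
    return num / (r_off**2 - r_on**2) ** 3


def shift(r, r_off=12.0):
    """Shifting function: scales the potential so it reaches 0 at r_off."""
    if r >= r_off:
        return 0.0
    return (1.0 - (r / r_off) ** 2) ** 2


def coulomb(q1, q2, r, distance_dependent=False):
    """Coulomb energy in kcal/mol for charges in units of e and r in Å.

    With distance_dependent=True the dielectric is taken as eps = r
    (an assumed form of the distance-dependent dielectric).
    """
    eps = r if distance_dependent else 1.0
    return 332.0716 * q1 * q2 / (eps * r)


if __name__ == "__main__":
    for r in (9.0, 10.5, 11.0, 11.5, 12.0):
        print(r, round(switch(r), 3), round(shift(r), 3))
```

Both functions bring the truncated interaction smoothly to zero at 
the outer distance, which avoids the energy discontinuity of a 
sharp cutoff during minimization. 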

2. Construction of the wild-type B3 Fv model 

A model of the B3 Fv structure was first obtained 
from the structure of the Fv domain of McPC603 (Satow et al., 
supra; Protein Data Bank entry 1MCP, Abola et al., supra) by 
deletion and mutation of appropriate residues using a 
molecular graphics program, GEMM. The sequence alignment 
scheme used to find corresponding residues was that of Kabat 
and Wu (supra, Fig. 1). The McPC603 structure was chosen over 
other known mouse Fab structures (e.g., J539 and R19.9, 
Protein Data Bank entries 2FBJ and 2F19, respectively) because 
its Fv portion has the highest overall identity and similarity 
in the amino acid sequence with that of B3. A total of 44 
(including two deletions) and 40 (including one deletion) 
amino acids of the V H and V L domains, respectively, of McPC603 
were changed, but no insertion was needed. 

This initial structure was then refined according to 
the following protocol: (1) Hydrogen atoms (both polar and 
nonpolar) were added using CHARMM and their positions refined 
by a 50-step energy minimization with the heavy atoms fixed. 
(2) In order to allow the C-N bond length reduction around the 
deletion regions, a 20-step energy minimization was done with 
all atoms fixed except those for 10 amino acids around each of 
the three deletion regions (V L :28-37, V H :50-59, 99-108), which 
were constrained with a mass-weighted harmonic force of 20 
kcal/mol/Å. This minimization was repeated with harmonic 
constraint forces of 15, 10, and then 5 kcal/mol/Å, each for 20 
steps. (3) The same set of constrained minimizations of step 
(2) was repeated using an expanded list of variable amino 
acids to include all the mutated amino acids as well as the 30 
amino acids around the deletion regions. (4) Finally, the 
same set of constrained minimizations was repeated to refine 
all atoms in the structure. The structure obtained after this 
set of refinements served as the starting structure for the 
disulfide bond introduction between the V H and V L domains and for 
the Ser to Tyr mutation (see below). The final structure of 
the wild-type B3 Fv was obtained after two additional sets of 
energy minimizations using a distance-dependent dielectric 
constant (see below). 
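
The staged relaxation in steps (2) to (4) can be pictured with the 
following toy sketch, in which harmonic restraints toward the 
starting coordinates are weakened through the schedule 20, 15, 10, 
5, with 20 minimization steps per stage. It is not the CHARMM 
procedure: plain steepest descent stands in for ABNR, the 
coordinates are a one-dimensional toy system, mass weighting is 
omitted, and the function name, step size, and energy function are 
hypothetical. 

```python
# Toy sketch: steepest descent replaces ABNR; all names are hypothetical.

def staged_restrained_minimize(x0, grad, force_constants=(20.0, 15.0, 10.0, 5.0),
                               steps_per_stage=20, step_size=0.01):
    """Minimize with harmonic restraints k*(x - ref)^2 that weaken stage by stage.

    x0    : list of starting coordinates (also used as the restraint reference)
    grad  : function returning the gradient of the unrestrained energy
    """
    ref = list(x0)
    x = list(x0)
    for k in force_constants:
        for _ in range(steps_per_stage):
            g = grad(x)
            # Total gradient = model gradient + restraint gradient 2k(x - ref).
            x = [xi - step_size * (gi + 2.0 * k * (xi - ri))
                 for xi, gi, ri in zip(x, g, ref)]
    return x


if __name__ == "__main__":
    # Toy energy (x - 1)^2 per coordinate, with its minimum at x = 1.
    toy_grad = lambda x: [2.0 * (xi - 1.0) for xi in x]
    print(staged_restrained_minimize([0.0], toy_grad))
```

As the restraint weakens, the coordinates are allowed to move 
progressively further from the starting model, which is the 
purpose of relaxing the structure in stages rather than all at 
once. 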

3. Construction of the Tyr mutant model 
During the examination of the newly constructed 
structure of B3 Fv, it was noted that there was an empty 
concave space in the V H -V L interface region of the FR core of 
the B3 Fv model structure, near the Ser side chain at the V H 95 
position (V H 91 in Kabat and Wu, supra). Other crystal 
structures of Fab have either Tyr or Phe at the corresponding 
position. The sequence data in Kabat et al. (supra) also show 
that this position is most often occupied by either Tyr or 
Phe. Thus, Ser at this position in B3 appears to be an 
anomaly. Furthermore, it was apparent from visual inspection 
that the side chain of Tyr at this position would fill the 
nearby empty space with hardly any change 
in the rest of the structure and that this would promote the 
V H -V L association by enhancing the hydrophobic and van der 
Waals interactions. We therefore constructed and 
energy-refined the Tyr mutant structure. 

The protocol used to construct the Tyr mutant model 
was similar to that used to construct the B3 Fv model: (1) 
The Ser residue of V H was replaced by Tyr using GEMM. (2) 
Hydrogen atoms were added and their positions refined using 
CHARMM by a 20-step energy minimization with all other atoms 
fixed. (3) All atoms of the new Tyr residue were allowed to 
vary during the next 20-step minimization with all other atoms 
fixed. (4) Finally, all atoms of the structure were allowed to 
relax in stages by means of four successive sets of 20-step 
energy minimization, each set with the mass-weighted harmonic 
constraint force of 20, 15, 10, and then 5 kcal/mol/Å. 

4. Selection of possible disulfide bond positions 
between V H and V L domains. 

Possible mutation sites for the introduction of a 
disulfide bond between the V H and V L domains were initially 
identified by visual inspection of the initial model of B3 
using our molecular graphics program, GEMM. The criteria for 
selection were (1) that both of the pair of residues to be 
mutated to Cys be in the FR region of the molecule, at least 
one residue away from the CDRs in the primary sequence, and (2) 
that the Cα-Cα distance between the two residues be less than 
or equal to 6.5 Å. Two pairs could be identified: 
V H 44R-V L 105S and V H 111Q-V L 48S. After the B3 model structure 
had been fully refined, the program CHARMM was used to 
systematically search for all residue pairs between the FR 
regions of the V H and V L domains for which the Cα-Cα distance was 
less than a specified value. The result of this search is 
summarized in Table 1, which shows that the Cα-Cα distance is 
the shortest at the two sites identified with the initial 
model of B3, but that other sites exist that are also 
potential candidates. 
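
A distance search of the kind described above can be sketched as 
follows. This is a minimal illustration, not the CHARMM search 
itself: the function name, the dictionary layout, and the 
coordinates are hypothetical, and a real search would first 
restrict both chains to FR residues. 

```python
import math
from itertools import product

# Hypothetical helper; coordinates below are invented, not from the B3 model.

def ca_pairs_within(vh_ca, vl_ca, cutoff=6.5):
    """Return (vh_residue, vl_residue, distance) for inter-chain C-alpha
    pairs no further apart than cutoff (Å), sorted by distance.

    vh_ca, vl_ca : dict mapping residue number -> (x, y, z) in Å
    """
    hits = []
    for (h, ph), (l, pl) in product(vh_ca.items(), vl_ca.items()):
        d = math.dist(ph, pl)
        if d <= cutoff:
            hits.append((h, l, round(d, 2)))
    return sorted(hits, key=lambda t: t[2])


if __name__ == "__main__":
    vh = {44: (0.0, 0.0, 0.0), 45: (3.8, 0.0, 0.0)}
    vl = {105: (0.0, 5.7, 0.0), 103: (3.8, 9.0, 0.0)}
    print(ca_pairs_within(vh, vl))               # strict 6.5 Å criterion
    print(ca_pairs_within(vh, vl, cutoff=8.0))   # wider 8.0 Å listing
```

Running the search with the wider cutoff corresponds to the 
listing in Table 1, while the 6.5 Å cutoff corresponds to the 
selection criterion used for the initial visual inspection. 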



Table 1. All Cα-Cα distances (in Angstroms) less 
than or equal to 8.0 Å between the FR regions of V H and V L of 
the Fv B3 model structure. 

Residue pair         Distance     Residue pair         Distance 
V H 43-V L 105         8.0        V L 47-V H 111         6.9 
V H 44-V L 103         7.5        V L 47-V H 112         8.0 
V H 44-V L 104         7.2        V L 48-V H 95          7.4 
V H 44-V L 105         5.7        V L 48-V H 109         7.0 
V H 44-V L 106         6.4        V L 48-V H 110         6.8 
V H 45-V L 103         6.0        V L 48-V H 111         5.6 
V H 45-V L 104         7.7        V L 48-V H 112         6.5 
V H 45-V L 105         8.0        V L 49-V H 109         7.0 
V H 46-V L 102 a       7.3        V L 50-V H 108 a       7.5 
V H 46-V L 103         6.9        V L 50-V H 109         6.9 
V H 47-V L 101 a       6.4        V L 51-V H 107 a       7.0 
V H 47-V L 102 a       6.8 
V H 47-V L 103         7.8 

a These residues are in the CDR region, but have close 
proximity to the FR region. 



V H positions 43, 101 and 102 and V L positions 96 and 
97 are located in the CDR region, but do yield stable ds bonds 
when substituted with cysteines, while maintaining binding 
specificity. 



5. Construction of the disulfide-bonded B3 Fv models. 

Once these potential disulfide bond sites were 
identified, six disulfide bonded models were generated. Three 
of these were "s44" (B3 Fv with V H 44R and V L 105S changed to 
Cys and disulfide bonded), "s111" (B3 Fv with V H 111Q and V L 48S 
changed to Cys and disulfide bonded), and "s44,111" (B3 Fv 
with both disulfide bonds). The other three were the 
corresponding disulfide bonded forms of the Tyr mutant, B3 
yFv. These are labelled y44, y111, and y44,111. All six 
model structures were refined by energy minimization using an 
identical protocol. This consisted of (1) mutation of the 
appropriate residues using GEMM, (2) addition of the hydrogen 
atoms, (3) allowing the disulfide bond(s) to form by relaxing 
the Cys residues and the two neighboring Gly residues by a 
100-step energy minimization with all other atoms fixed, and (4) 
refinement of all atoms of the structure by four successive 
sets of 20-step energy minimizations, each with the 
mass-weighted harmonic constraint force of 20, 15, 10, and then 5 
kcal/mol/Å. Afterwards, all structures were subjected to the 
final refinements as described below. 

6. Generation of the final models of the wild-type 
and different variants of B3 Fv. 

The constructed models of B3 Fv and of all of its 
variants were subjected to an additional 500-step minimization 
followed by another 500-step procedure with the exit criterion 
being to stop the run when the total energy change becomes 
less than or equal to 0.01 kcal/mol. These final calculations 
were carried out without any constraint and using the 
distance-dependent dielectric constant. The various energy 
values reported in Table 2 are from the last cycle of these 
calculations. 

Table 2. The energy components, in kcal/mol, of B3 
Fv, of the species with an interchain disulfide bond at 
V H 44-V L 105 (s44), at V H 111-V L 48 (s111), at both sites 
(s44,111), and of their corresponding variants with Ser to Tyr 
mutation at V H 95 (B3 yFv, y44, y111, and y44,111). 

               B3 Fv        s44          s111         s44,111 
S1 a           -35.4        33.8         -35.2        34.6 
S2 b           23.4         23.5         39.8         39.7 
R c            -893.9       -909.7       -826.6       -833.0 
S1-R d         -65.1        -25.9        -64.5        -25.5 
S2-R e         -58.6        -58.8        [illegible]  [illegible] 
V H -V L f     -172.1       -150.4       -150.5       [illegible] 
Total g        -1029.6      -937.1       -915.9       -813.7 

               B3 yFv       y44          y111         y44,111 
S1 a           -35.1        33.9         -35.3        33.4 
S2 b           23.8         22.7         40.4         39.9 
R c            -910.1       -943.3       -902.6       -912.0 
S1-R d         -66.0        -26.4        -65.2        -26.2 
S2-R e         -63.8        -74.5        -33.2        -33.1 
V H -V L f     -192.6       -161.3       -177.6       -141.6 
Total g        -1051.2      -987.6       -996.0       -898.1 

a Residues V H 44 (R or C) and V L 105 (S or C). 
b Residues V H 111 (Q or C) and V L 48 (S or C). 
c Rest of the molecule other than S1 and S2. 
d Interaction energy between groups S1 and R. 
e Interaction energy between groups S2 and R. 
f Interaction energy between V H and V L. 
g Sum of the energies for S1, S2, R, S1-R, and S2-R, plus the 
interaction energy between S1 and S2, which is negligible for 
all molecules. 
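
The definition of the total energy given in the last footnote of 
Table 2 can be checked against the tabulated components with a 
short script. The sketch below uses the values as read from Table 2 
for the six species whose entries are all legible; the dictionary 
layout and the function name are illustrative assumptions. 

```python
# Values as read from Table 2 (kcal/mol); s111 and s44,111 are omitted
# because their S2-R entries are not legible in the table.
TABLE2 = {
    "B3 Fv":   {"S1": -35.4, "S2": 23.4, "R": -893.9, "S1-R": -65.1, "S2-R": -58.6, "Total": -1029.6},
    "s44":     {"S1": 33.8,  "S2": 23.5, "R": -909.7, "S1-R": -25.9, "S2-R": -58.8, "Total": -937.1},
    "B3 yFv":  {"S1": -35.1, "S2": 23.8, "R": -910.1, "S1-R": -66.0, "S2-R": -63.8, "Total": -1051.2},
    "y44":     {"S1": 33.9,  "S2": 22.7, "R": -943.3, "S1-R": -26.4, "S2-R": -74.5, "Total": -987.6},
    "y111":    {"S1": -35.3, "S2": 40.4, "R": -902.6, "S1-R": -65.2, "S2-R": -33.2, "Total": -996.0},
    "y44,111": {"S1": 33.4,  "S2": 39.9, "R": -912.0, "S1-R": -26.2, "S2-R": -33.1, "Total": -898.1},
}

COMPONENTS = ("S1", "S2", "R", "S1-R", "S2-R")


def component_sum(row):
    """Sum of the five listed components; the S1-S2 term is neglected."""
    return sum(row[name] for name in COMPONENTS)


if __name__ == "__main__":
    for species, row in TABLE2.items():
        residual = row["Total"] - component_sum(row)
        print(f"{species:8s} sum={component_sum(row):9.1f} "
              f"total={row['Total']:9.1f} residual={residual:5.1f}")
```

The residual between the tabulated total and the five-component 
sum is the neglected S1-S2 interaction together with rounding, and 
it stays within about 0.1 kcal/mol for these species. 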

7. Model of B3 Fv fragment. 

The refined model of the B3 Fv structure can be compared 
to the (unrefined) crystal structure of McPC603 (not shown). 
The rms deviations between the Cα atoms of these two 
structures, excluding the deleted residues, were 0.75, 1.18, 
and 0.91 Å, respectively, for the FR region, CDR region, and 
the whole molecule. Most of the difference occurs at the 
loops and at the C- and N-termini of the molecule. Some of 
the difference between these structures is probably also due 
to the fact that one is energy-refined and the other not. The 
McPC603 structure was not refined because an energy-minimized 
structure is not necessarily more reliable than the crystal 
structure, especially when the refinements are carried out 
without the solvent water. 
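
For reference, an rms deviation of the kind quoted above can be 
computed as in the following sketch. It assumes that the two 
coordinate sets are already superimposed and listed in the same 
residue order (with deleted residues removed); the function name 
and the example coordinates are hypothetical, and no fitting step 
is performed. 

```python
import math

# Hypothetical helper: assumes pre-superimposed, equally ordered C-alpha lists.

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between two equal-length lists of
    (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    sq = sum((p - q) ** 2
             for a, b in zip(coords_a, coords_b)
             for p, q in zip(a, b))
    return math.sqrt(sq / len(coords_a))


if __name__ == "__main__":
    model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
    crystal = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
    print(rmsd(model, crystal))
```

Restricting the input lists to FR residues, CDR residues, or all 
residues gives the three separate values of the kind reported in 
the text. 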



8. Tyr mutant of B3 Fv (B3 yFv) 

As described above, we constructed a mutant of B3 Fv 
wherein the Ser residue at V H 95 is replaced by a Tyr residue. 
The effect of this mutation upon the stability of Fv cannot be 
computed quantitatively because of the lack of information on 
the structure of the dissociated, unfolded form of Fv. The 
numbers that are produced naturally during the structure 
refinement are various energy terms in the folded form of the 
molecule. When the Ser side chain was replaced by that of 
Tyr, the Lennard-Jones potential energy of the mutated residue 
with the rest of the protein was 1.79 kcal/mol before the 
hydrogen atoms were refined, 0.05 kcal/mol after a 20-step 
minimization of the hydrogen atoms only, and -20 kcal/mol 
after full refinement of all atoms. These numbers indicate 
that the modeled B3 Fv structure can accommodate a Tyr residue 
at this position without any serious steric overlap. The 
various energy terms after full refinement of all atoms are 
listed in Table 2. It can be seen that the Tyr mutant always 
has lower energy than its Ser counterpart, both in the 
wild-type and in all of the Cys mutants. The rms deviation 
between the main-chain atoms of B3 Fv and B3 yFv was 0.15 Å. 

9. Models of disulfide bonded B3 Fv fragments. 

The two sites selected for a potential inter-chain 
disulfide bond formation are site S1 at V H 44R-V L 105S and site 
S2 at V H 111Q-V L 48S (V H 44-V L 100 and V H 105-V L 43, respectively, 
according to the numbering scheme of Kabat et al., supra). 
These sites are in the FR region, at least two residues away 
from the nearest CDR region. The inter-chain Cα-Cα distance 
at these sites was the shortest in the unrefined model and is the shortest in 
the refined model (Table 1). It was also noted that one of 
the residues in each pair, V L 105 and V H 111, is flanked on both 
sides by a highly conserved Gly residue. We reasoned that 
these Gly residues would provide flexibility to the middle 
residue and absorb some of the distortions that could be 
produced when a disulfide bond is formed. 

We constructed both the singly and doubly disulfide 
bonded models, each with or without the Ser to Tyr mutation at 
V H 95. The structural change upon introduction of the 
disulfide bond is small if computed as an average per residue: 
the rms deviations between the main-chain atoms of the 
disulfide bonded variants and those of their parent molecules 
were 0.2 to 0.3 Å. However, significant changes do occur at 
the site of mutation, as is inevitable since the Cα-Cα distance 
must decrease by 0.5 to 1.0 Å. (See Tables 1 and 3.) Large 
changes, however, appear to propagate only a short distance 
along the chain and all but disappear within a couple of 
residues or after the first loop in the FR region. 

Table 3. The values of the dihedral angles (in 
degrees) and of the Cα-Cα' distance (in Å) of the cysteine 
residue in various species a. 

               Cα-Cβ     Cβ-S       S-S'     S'-Cβ'    Cβ'-Cα'   Cα-Cα' 
S1 (V H 44-V L 105): 
  s44          -48.4     -143.0     93.9     -89.6     -76.2     4.66 
  s44,111      -41.9     -150.2     95.9     -87.1     -76.8     4.76 
  y44          -49.1     -142.3     93.7     -87.5     -76.4     4.59 
  y44,111      -49.3     -138.8     92.4     -93.1     -73.8     4.63 
S2 (V L 48-V H 111): 
  s111         35.0      179.5      68.5     -90.9     -74.1     4.99 
  s44,111      34.1      179.6      68.2     -91.0     -73.7     5.01 
  y111         -31.6     -156.7     104.1    -66.6     -90.8     4.71 
  y44,111      -32.9     -157.0     104.5    -67.5     -89.8     4.73 
Literature b: 
  Class 3      71(9)     -166(13)   103(2)   -78(5)    -62(8)    5.00 
  Class 6      -55(3)    -121(11)   101(3)   -83(4)    -53(7)    4.18 

a The first five columns of numbers are the dihedral angles for 
N-Cα-Cβ-S, Cα-Cβ-S-S', Cβ-S-S'-Cβ', S-S'-Cβ'-Cα', and S'-Cβ'-Cα'-N', 
in the direction of V H 44 to V L 105 for the S1 site and 
in the direction of V L 48 to V H 111 for the S2 site. 
b From Katz et al., infra. The quoted values are averages over 
4 examples for class 3 and 8 examples for class 6, each with 
the standard deviation in parentheses. 

10. Energies of the disulfide bonded models. 

The stability of any of these mutants is difficult 
to estimate because of the lack of structural information on 
the corresponding unfolded forms. The various energy terms of 
the fully refined models are listed in Table 2. In 
considering these energy terms, one should bear in mind that 
the precise values are subject to the inherent uncertainties 
associated with the empirical potential energy functions and 
to the errors introduced by neglecting the solvent. These 
figures are meant to be used for qualitative considerations 
only. 

Comparing first the energies of sites S1 and S2 of 
species B3 Fv and B3 yFv, it can be seen that the S1 site has 
a substantially lower energy than the S2 site before the 
mutation. This means that, if the mutated forms had the same 
energy, mutating the S1 site would be energetically more costly 
than mutating the S2 site. These energy values are, however, 
especially unreliable because the residues involved before the 
mutation are Arg, Ser, and Gln, which are all highly polar, 
and the energy value will be sensitively affected by the 
absence of the solvent. 

On the other hand, the internal energy of the 
cysteine residue present at S1 after the mutation is about 6 
kcal/mol lower than that present at S2, both in the singly 
and in the doubly disulfide bonded species. This is true 
whether the V H 95 residue is Ser or Tyr. Although this is a small 
energy difference, this calculation should be more reliable 
since it involves one covalently bonded moiety with no formal 
charge. Examination of the detailed composition of this 
energy difference indicates that most of it arises from the 
difference in the energy of the bond angle, which accounts for 
3-4 kcal/mol, and from that of the torsion angle, which 
accounts for 1-2 kcal/mol. This indicates that the 
disulfide bond at S2 is slightly more strained than that at 
S1. 

The interaction energy with the rest of the molecule 
rises by about 40 kcal/mol for site S1 and by about 30 
kcal/mol for site S2, favoring S2. There is a much larger 
change in the energy of the rest of the molecule at sites 
other than S1 and S2, which implies that a conformational 
change occurs in this part of the molecule. However, a 
detailed examination of the structural changes and various 
energy components indicates that only a minor part of these 
differences can be traced to be a direct result of the 
introduction of the disulfide bond. The major part of the 
difference appears to be due to natural flexibility of the 
molecule at the exposed loops, coupled with the fact that the 
computed energy values are sensitive to small changes in the 
position of charged, flexible side chain atoms. In general, 
however, it appears that the energy of the molecule increases 
upon introduction of a disulfide bond and that it rises 
proportionately more when two disulfide bonds are formed. The 
magnitude of the rise per disulfide bond is comparable to that 
of the S1 site, i.e., the energy change upon converting an Arg 
and Ser to two Cys. It can also be noted that the V H -V L 
interaction energy generally increases in magnitude upon the 
Tyr mutation at V H 95. 

11. Geometries of the disulfide bonded models. 
All disulfide bonds are found to be right-handed 
(Table 3). The cysteine residue formed at site S1 is 
approximately related to that formed at site S2 by the pseudo 
two-fold symmetry of the molecule. However, their detailed 
geometries (Table 3) indicate that they fall into two types. 
All but two of the eight cysteine residues are of one type 
(type A) while the remaining two, the ones at S2 in species 
s111 and s44,111, are of a different type (type B). Katz et 
al., J. Biol. Chem. 261:15480-15485 (1986), incorporated by 
reference herein, surveyed the conformation of cysteine 
residues in known protein crystal structures and classified 
the right-handed forms into six different classes. The two 
types found in our models do not exactly fit into any of these 
classes. The dihedral angle values of the two classes that fit 
the modeled geometry best are also included in Table 3. Class 
6, with 8 examples, represents the most common geometry for 
the right-handed cysteine residues found in other protein 
structures. The internal dihedral angles of the disulfide 
bonds at site S1 are rather close to those in this class. On 
the other hand, the disulfide bonds at site S2 have internal 
dihedral angles that deviate considerably from their closest classes 
(class 6 for type A bonds in the y111 and y44,111 species and 
class 3 for the type B bonds in the s111 and s44,111 species). 
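
The dihedral angles discussed here can be computed from four atom 
positions as in the sketch below, which also labels a disulfide as 
right- or left-handed from the sign of its Cβ-S-S'-Cβ' angle. The 
function names and the test coordinates are illustrative 
assumptions; the sign convention is the usual one in which a 
clockwise rotation of the far bond, viewed along the central bond, 
is positive. 

```python
import math

# Illustrative helpers; names and example values are assumptions.

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points, about p1-p2."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

def handedness(chi_ss):
    """Classify a disulfide by the sign of its Cβ-S-S'-Cβ' dihedral angle."""
    return "right-handed" if chi_ss > 0 else "left-handed"

if __name__ == "__main__":
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
    print(handedness(93.9), handedness(68.5))  # S-S' values of s44 and s111
```

Applied to the S-S' column of Table 3, every modeled species has a 
positive angle and is therefore classified as right-handed, in 
agreement with the statement above. 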

The large deviation of type B geometry from that of 
other disulfide bonds is probably related to the existence of 
the cavity near the S2 site in B3(Fv), at the bottom of which 
is the V H 95 serine residue. The new disulfide bond is at the side 
of this cavity and the Cβ atom of the V L 48 residue is pulled in 
toward this cavity. The large deviation of the Cα-Cβ-S-S' and 
Cβ-S-S'-Cβ' dihedral angles of type B from those of others in 
class 3 is related to this distortion of the main-chain. The 
Tyr mutation at V H 95 fills this cavity with the Tyr side chain 
and appears to relieve the main-chain distortion and to change 
the geometry of the cysteine residue from type B to type A. 
Even after the mutation, however, the geometry of the 
disulfide bond at the S2 site deviates more from the class 6 
geometry than that at the S1 site. 

The main-chain dihedral angle values (Table 4) 
indicate that mutation at S1 has no effect on the geometry of 
the main-chain at S2 and vice versa. Large angle changes are 
restricted to the mutated residue in the heavy chain. The 
sole exception is the 30° change in the ψ angle of V H 110 for 
the s111 and s44,111 species, a feature probably related to 
the existence of the cavity near S2 in these species. The Tyr 
mutation at V H 95 changes this and other main-chain dihedral 
angles at S2 (φ and ψ of V L 48 and ψ of V H 110 and V H 111). 

12. Modeling Conclusion. 

It is well known that each of the heavy and light
chains of the Fv fragment forms a nine-stranded beta-barrel
and that the interface between the heavy and light chains that
forms at the center of the molecule is also barrel-shaped
(Richardson, Adv. Prot. Chem. 34:167-339 (1981)). One side of
this central barrel is made of four strands from the heavy
chain while the other side is made of four strands from the
light chain. These two sides join each other around the
barrel at two sites, which are related by the approximate two-
fold symmetry that runs along the axis of the barrel (Davies
et al., Ann. Rev. Biochem. 44:639-667 (1975)). At each site,
a stretch of the β4 strand of one chain (V H 44-47 or V L 48-51
for B3 Fv) is next to, and runs antiparallel to, a stretch of
the β9 strand of the other chain (V L 105-101 or V H 111-107 for
B3 Fv). In the modeled structure of B3 Fv, and probably in
the Fv of all immunoglobulins, the closest interchain
contacts between the main-chain atoms in the FR region occur
either within these stretches or at the immediate fringes of
these stretches (Table 1). Since the Cα-Cα distance of a
cysteine residue in known protein structures ranges from 4.2 to
6.6 Å (Katz et al., J. Biol. Chem. 261:15480-15485 (1986)), it
is improbable that an interchain disulfide bond can be formed
in the FR region outside of these sites without introducing
large, damaging distortions to the molecule.
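The distance criterion described above can be illustrated with a short sketch. It assumes that Cα coordinates (in Å) are available for framework residues of both chains, and it lists interchain pairs whose Cα-Cα distance falls within the 4.2 to 6.6 Å range cited from Katz et al. The coordinates and residue labels in the usage example are invented for illustration and are not taken from the B3 model; the function name is hypothetical.

```python
import math
from itertools import product

# Cα-Cα range for disulfide-bonded residues cited in the text (Å).
CA_MIN, CA_MAX = 4.2, 6.6

def candidate_pairs(vh_ca, vl_ca, lo=CA_MIN, hi=CA_MAX):
    """Return (vh_label, vl_label, distance) for interchain residue pairs
    whose Cα-Cα distance lies within [lo, hi], sorted by distance."""
    hits = []
    for (h, ph), (l, pl) in product(vh_ca.items(), vl_ca.items()):
        d = math.dist(ph, pl)
        if lo <= d <= hi:
            hits.append((h, l, round(d, 2)))
    return sorted(hits, key=lambda t: t[2])

# Hypothetical coordinates, for illustration only.
vh = {"VH44": (0.0, 0.0, 0.0), "VH60": (20.0, 0.0, 0.0)}
vl = {"VL105": (5.6, 0.0, 0.0), "VL30": (0.0, 15.0, 0.0)}
print(candidate_pairs(vh, vl))
```

A screen of this kind only narrows the search; as discussed in this section, proximity to the CDR loops still has to be considered when choosing among the pairs it returns.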

The two possible disulfide bonding sites studied in
this report are at the shortest contact points in each of these
sites (Table 1). The disulfide bonds at V H 44-V L 106, V H 112-
V L 48, and V H 111-V L 47 are also good sites. Other pairs with
short Cα-Cα distances are less preferable since they are
closer to the CDR loops in the three-dimensional structure and
therefore more likely to disturb the antigen binding function
of the molecule.

However, both of the sites that Glockshuber et al. (supra) used
for McPC603, V H 108-V L 55 and V H 106-V L 56, involved residues in
the CDR region and obviously were not the two sites that we
identified. These sites correspond to V H 105-V L 54 and
V H 103-V L 55 of B3 and
are at the extreme CDR end of the β4/β9 strands, at the
opposite end of which lies the S2 site of V H 111-V L 48. This
difference results at least in part from the different
strategy used to search for the potential disulfide bond
sites: they searched for interchain residue pairs, neither of
which was Pro, and all of whose main-chain atoms were arranged
in a geometry similar (within 2 Å in rms) to that of a
cysteine residue in a list of all such residues in known
protein structures. They avoided the residues directly
involved in the hapten binding, but otherwise allowed them to
be in the CDR region. In contrast, we searched for sites
strictly in the FR region only, while relaxing the
constraints on geometry by requiring only that their Cα to Cα
distance be short. We reasoned that a distortion at the site
of mutation was inevitable and that an insistence on a
similarity of the whole main-chain before the disulfide bond
formation was probably too restrictive.

The calculated main-chain dihedral angle values
(Table 4) indicate that disulfide bonds can be formed at these
sites without a large change in the internal geometry of the
main-chain. In
particular, the changes in the main-chain dihedral angles of
the flanking Gly residues, which we initially thought would
help absorb some of the distortions, are small. The internal
geometries of the cysteine residues formed (Table 3) appear to
be close to the geometries of other cysteine residues in known
protein structures, at least at one of the two sites. The
calculated energy values must be used with caution because of
the inherent uncertainties associated with the empirical
potential function used, because the solvent was not included
in the calculation, and because the calculation is possible
only for the folded form whereas what is needed is the
difference between the folded and unfolded forms. The
calculations nevertheless indicate (Table 2) that the
energetic cost for introducing a disulfide bond at the two
sites will be basically that of converting the character of
two residues' worth of the protein surface from charged to
non-polar. All of these indicated to us that introduction of
a disulfide bond at one of these two sites would be possible.

The main-chain geometries and the internal
geometries of the cysteine residue, as well as the V H -V L
interaction energies, indicate that the Ser to Tyr mutation at
V H 95 is likely to be beneficial. The energetic considerations
indicate that the species y44 and y111 would be roughly
equally suitable and preferable over the double disulfide
bonded species. Finally, the comparison of the internal
geometry of the cysteine residue with that of others in known
protein structures gives a slight edge for the y44 species
over the y111.

Table 4. The main-chain dihedral angles, φ (first
angle) and ψ, in degrees, of indicated residues in various
species of B3 Fv.



              V H 44 (R,C)      V L 104 (G)       V L 105 (S,C)     V L 106 (G)
Species        φ       ψ         φ       ψ         φ       ψ         φ       ψ

B3 Fv        -91.5  -164.5     -69.6   172.7     -80.5   -10.6      94.0   119.8
B3 yFv       -91.5  -165.2     -69.2   172.4     -79.4   -11.6      95.0   119.1
s111         -89.1  -168.0     -70.5   171.8     -79.2   -13.3      94.3   121.3
y111         -93.9  -165.6     -69.8   172.4     -79.6   -11.3      94.4   120.0
s44         -134.4  -173.1     -85.8   164.8     -87.9    -1.0     104.5   137.2
s44,111     -109.8  -169.8     -86.5   163.5     -88.0     1.7     102.1   141.6
y44         -128.2  -173.2     -85.7   167.2     -89.0    -2.6     104.8   135.1
y44,111     -141.0  -175.1      86.6   162.1     -85.0    -3.6     106.8   137.9


              V L 48 (S,C)      V H 110 (G)       V H 111 (Q,C)     V H 112 (G)
Species        φ       ψ         φ       ψ         φ       ψ         φ       ψ

B3 Fv        -84.5   154.7     -86.4  -144.9    -106.4   -43.8     111.2   141.6
B3 yFv       -85.8   146.7     -85.5  -141.6    -108.8   -45.1     115.6   142.1
s44          -85.0   153.8     -87.0  -145.1    -107.0   -44.4     111.7   142.1
y44          -85.7   145.5     -86.4  -144.3    -117.2   -42.5     118.2   138.0
s111         -88.6   151.4     -88.7  -172.7    -130.6    -5.4     116.2   138.6
s44,111      -89.9   151.4     -88.7  -171.6    -131.0    -5.6     115.7   139.0
y111         -79.0   131.7     -86.8  -149.1    -135.3   -17.0     114.1   136.0
y44,111      -80.0   132.9     -87.1  -151.1    -133.9   -16.4     113.0   135.9


The fact that the disulfide bond sites found here
are in the highly conserved framework region is significant.
The Cys mutant at these sites is expected to work because the
structure of the framework region is relatively similar from
protein to protein. As a partial test of this expectation, we
have computed the Cα-Cα distances at these sites using the
crystal structures for all known immunoglobulin Fv regions.
These data (Table 5) indicate that, while there are
variations, the Cα-Cα distances are indeed suitably short for
formation of a disulfide bond at at least one of the sites in
all the proteins, including some from the human source. These
sites can be found for any immunoglobulin simply from the
sequence alignment without the need for computer modeling or
structural information.

Table 5. The Cα-Cα distances (in Angstroms)
between residue pairs in immunoglobulins* at positions
homologous to those of V H 44-V L 105 and V H 111-V L 48 in B3



Structure    Residue pair          Distance    Residue pair          Distance

B3 model     V H 44R-V L 105S      5.6         V H 111Q-V L 48S      5.6
1MCP         V H 44R-V L 106A      5.6         V H 114A-V L 49P      5.7
2FB4         V H 44G-V L 101T      6.0         V H 110Q-V L 42A      5.4
2FBJ         V H 44G-V L 99A       5.8         V H 110Q-V L 42S      5.8
2IG2         V H 44G-V L 101T      5.9         V H 111Q-V L 42A      4.9
3FAB         V H 44G-V L 101G      5.3         V H 109Q-V L 42A      6.0
1FAI         V H 44G-V L 100G      4.4         V H 116Q-V L 43T      6.4
2F19         V H 44G-V L 100G      4.1         V H 116Q-V L 43T      5.6
1FDL         V H 44G-V L 100G      5.4         V H 108Q-V L 43S      5.6
1IGF         V H 44R-V L 100G      5.9         V H 115Q-V L 43S      6.3
2HFL         V H 44G-V L 98G       4.6         V H 108Q-V L 42S      5.8
3HFM         V H 44R-V L 100G      6.4         V H 105Q-V L 43S      6.0
4FAB         V H 44G-V L 105G      6.8         V H 110Q-V L 48S      5.3
6FAB         V H 44G-V L 100G      5.2         V H 113Q-V L 43T      6.2

*The immunoglobulins are identified by the Brookhaven Data Bank
file names (Abola et al., supra). All are from the mouse
except three (2FB4, 2IG2, and 3FAB) which are from the human.

II. Production of a B3(dsFv) immunotoxin.

B3(dsFv)-PE38KDEL is a recombinant immunotoxin composed
of the Fv region of MAb B3 connected to a truncated form of
Pseudomonas exotoxin (PE38KDEL), in which the V H and V L domains
are held together and stabilized by a disulfide bond.

A. Construction of plasmids for expression of
B3(dsFv)-immunotoxins.

The parent plasmid for the generation of plasmids
for expression of ds(Fv)-immunotoxins, in which V H arg44 and
V L ser105 are replaced by cysteines, encodes the single-chain
immunotoxin B3(Fv)-PE38KDEL(tyrH95). In this molecule the V H
and V L domains of MAb B3 are held together by a (gly4ser)3
peptide linker (B3scFv) and then fused to the PE38KDEL gene
encoding the translocation and ADP-ribosylation elements of
Pseudomonas exotoxin (PE) (Brinkmann et al., Proc. Natl. Acad.
Sci. USA 88:8616-8620 (1991) (Brinkmann I); Hwang et al., Cell
48:129-136 (1987), both of which are incorporated by reference
herein). B3(Fv)-PE38KDEL(tyrH95) is identical to B3(Fv)-
PE38KDEL (Brinkmann I, supra) except for a change of serine 95
of B3(V H) (position V H 91 according to Kabat et al.) to
tyrosine. This tyrosine residue is conserved in the framework
of most murine V H domains and fills a cavity in the V H -V L
interface, probably contributing to V H -V L domain interactions.
We have compared the properties of B3(Fv)-PE38KDEL and B3(Fv)-
PE38KDEL(tyrH95), including ability to be renatured, behavior
during purification, and cytotoxic activity towards carcinoma
cell lines, and found them to be indistinguishable.

The plasmids for expression of the components of
ds(Fv)-immunotoxins, B3(V H cys44) and B3(V L cys105)-PE38KDEL,
were made by site-directed mutagenesis using uridine-
containing single-stranded DNA derived from the F+ origin in
pULI28 as template to mutate arg44 in B3(V H) and ser105 in
B3(V L) to cysteines (Kunkel, Proc. Natl. Acad. Sci. USA
82:488-492 (1985)); see below for sequences of the mutagenic
oligonucleotides. The final plasmids, pYR38-2 for expression
of B3(V H cys44) and pULI39 for B3(V L cys105)-PE38KDEL, were made
by subcloning from the mutagenized plasmids. Details of the
cloning strategy are shown in Fig. 2.

Plasmid constructions:

Uracil-containing single-stranded DNA from the F+
origin present in our expression plasmids was obtained by
cotransfection with M13 helper phage and was used as template
for site-directed mutagenesis as previously described (Kunkel,
T.A., Proc. Natl. Acad. Sci. USA 82:488-492 (1985)). The
complete nucleotide sequence of B3(Fv) has been described
before (Brinkmann I, supra). The mutagenic oligonucleotides
were
5'-TATGCGACCCACTCSAGACACTTCTCTGGAGTCT-3' (Seq. ID No. 5) to
change arg44 of B3(V H) to cys,
5'-TTTCCAGCTTTGTCCCACAGCCGAACGTGAATGG-3' (Seq. ID No. 6) to
replace ser105 of B3(V L) with cys, and
5'-CCGCCACCACCGGATCCGCGM22SATTAGGAGACAGTGACCAGAGTC-3' (Seq.
ID No. 7) to introduce stop codons followed by an EcoRI site
at the 3'-end of the B3(V H) gene. Restriction sites (XhoI and
EcoRI) introduced into these oligonucleotides to facilitate
identification of mutated clones or subcloning are underlined.
The oligonucleotides
5'-TCGGTTGGAAACTTTGCAGATCAGGAGCTTTGGAGAC-3' (Seq. ID No. 8),
5'-TCGGTTGGAAACGCAGTAGATCAGAAGCTTTGGAGAC-3' (Seq. ID No. 9),
5'-AGTAAGCAAACCAGGCGGAXTCAGGCCAGTC^ 3' (Seq. ID No. 10), and
5'-AGTAAGCAAAAC^GGCTCCCCAGGCCAGTCCTCTTGCGCAGTAATATATGGC-3' (Seq.
ID No. 11) were used to introduce cysteines at V L 54, V L 55,
V H 103 and V H 105 of B3(Fv), which correspond to the positions
V L 55, V L 56, V H 106 and V H 108 of the described disulfide-
stabilized McPC603 Fv (Glockshuber et al., supra; see Table
7). All mutated clones were confirmed to be correct by DNA
sequencing. The B3(V L cys105) mutation was subcloned into a
B3(V L)-PE38KDEL immunotoxin coding vector by standard
techniques according to Sambrook et al., Molecular Cloning: A
Laboratory Manual (2nd ed.), Vols. 1-3, Cold Spring Harbor
Laboratory (1989), incorporated by reference herein (see also
Fig. 2).

B. Expression in inclusion bodies, refolding and
purification.

B3(Fv)-PE38KDEL, B3(Fv)cysH44L105-PE38KDEL,
B3(V L cys105)-PE38KDEL and B3(V H cys44) were produced in
separate E. coli BL21 (λDE3) cultures containing pULI9, pULI37,
pULI39 or pYR38-2, respectively, essentially as described
(Brinkmann I, supra).

To produce recombinant B3(dsFv)-immunotoxins,
separate E. coli BL21 (λDE3) cultures containing either the
B3(V H cys44) encoding plasmid pYR38-2 or the B3(V L cys105)-
PE38KDEL encoding plasmid pULI39 were induced with IPTG, upon
which the recombinant proteins accumulated to 20-30% of the
total protein in intracellular inclusion bodies (IBs). Active
immunotoxins were obtained after the IBs were isolated
separately, solubilized, reduced and refolded in renaturation
buffer containing redox-shuffling and aggregation-preventing
additives. The refolding for dsFv was performed as previously
described for the preparation of single-chain immunotoxins
(Buchner et al., Anal. Biochem. 205:263-270 (1992),
incorporated by reference herein) with two modifications: (i)
Instead of adding only one solubilized and reduced protein
(e.g. B3(Fv)-PE38KDEL) to the refolding solution, we prepared
IBs containing V H cys44 or V L cys105-toxin separately and mixed
them in a 2 (V H) : 1 (V L -toxin) molar ratio to a final total
protein concentration of 100 µg/ml in the refolding buffer.
We found that a 2-5 fold excess of V H over the V L -toxin gave
the best yield of renatured immunotoxin. Equal molar addition
of V H and V L -toxin into the renaturation solution or a >5 fold
excess of V H resulted in a reduction of the yield of active
monomeric immunotoxin; with too much V H we observed increased
aggregation. (ii) A "final oxidation" step was added, in which excess
oxidized glutathione was added to the refolding solution after
the redox-shuffling was completed. This oxidation increased
the yield of properly folded functional protein by at least
five-fold, probably because the disulfide bond connecting V H
and V L is exposed on the surface of the Fv, is accessible
to the slightly reducing conditions in the refolding buffer, and
would remain reduced without "final oxidation."

To recover active immunotoxins after refolding, we
adapted the purification scheme established for scFv-
immunotoxins (Brinkmann I, supra; Brinkmann et al., J.
Immunol. 150:2774-2782 (1993) (Brinkmann II), incorporated by
reference herein; Buchner et al., supra), which is ion-
exchange chromatography (Q-sepharose and MonoQ columns)
followed by size exclusion chromatography. Properly folded
(dsFv)-immunotoxins have to be separated not only from
aggregates, which separate easily, but also from "single-
domain" V L -toxins, which have a chromatographic behavior close
to (dsFv)-immunotoxins (Brinkmann II, supra). After refolding
of B3(dsFv)-PE38KDEL, the MonoQ "monomer peak" contains two
proteins; the dsFv-immunotoxin elutes slightly earlier than
the V L -toxin. We purified B3(dsFv)-PE38KDEL to near
homogeneity by consecutive cycles of chromatography, pooling
early fractions, rechromatographing peak fractions and
discarding late fractions. Despite significant losses of
active dsFv-immunotoxin (discarded "late" fractions still
contain dsFv-protein), this procedure is efficient enough to
obtain >8 mg pure dsFv-immunotoxin from 1 liter each of
bacterial V H and V L -toxin cultures, and we expect to greatly
increase this yield by modifying our purification conditions.
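As a worked illustration of the 2:1 molar mixing described for the refolding step, the sketch below splits a total protein concentration between the two components. The molecular masses used (about 13 kDa for V H and about 51 kDa for the V L -toxin) are rough assumptions introduced here only for the arithmetic; they are not stated in this description and should be replaced with measured or sequence-derived values. The function name is hypothetical.

```python
def split_by_molar_ratio(total_ug_per_ml, ratio_a, ratio_b,
                         mass_a_kda, mass_b_kda):
    """Split a total mass concentration between components A and B so
    that their molar ratio is ratio_a : ratio_b.
    Returns (ug/ml of A, ug/ml of B)."""
    weight_a = ratio_a * mass_a_kda  # relative mass contributed by A
    weight_b = ratio_b * mass_b_kda  # relative mass contributed by B
    total_weight = weight_a + weight_b
    return (total_ug_per_ml * weight_a / total_weight,
            total_ug_per_ml * weight_b / total_weight)

# Assumed (hypothetical) masses: VH about 13 kDa, VL-toxin about 51 kDa.
vh_ug, vl_toxin_ug = split_by_molar_ratio(100.0, 2, 1, 13.0, 51.0)
print(f"VH: {vh_ug:.1f} ug/ml, VL-toxin: {vl_toxin_ug:.1f} ug/ml")
```

Under these assumed masses, roughly one third of the 100 µg/ml total would be V H by mass, even though V H is in two-fold molar excess.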

III. Specific toxicity of B3(dsFv)-PE38KDEL towards B3-
antigen expressing carcinoma cell lines.

The activity of different immunotoxins (IC50 in
ng/ml) towards carcinoma cell lines was determined as
described in Tables 6 and 7. B3(scdsFv)-PE38KDEL molecules
are single-chain immunotoxins which in addition to the
(gly4ser)3 linker have cysteines introduced in V H and V L to
form an interchain disulfide. V H 44-V L 105 corresponds to
B3(dsFv)-PE38KDEL, except that in B3(dsFv) the linker peptide
is deleted. V H 105-V L 54 and V H 103-V L 55 are the positions where
cysteine residues were introduced in the previously described
"custom-made" V H 108-V L 55 and V H 106-V L 56 disulfide bonded
McPC603(Fv) (Glockshuber et al., supra).

TABLE 6

Cytotoxicity of recombinant B3-immunotoxins towards
different cell lines



                                        Cytotoxicity in ng/ml
Cell Line   Cancer Type   B3-Ag*    B3(Fv)-PE38KDEL   B3(dsFv)-PE38KDEL

MCF7        Breast        +++       0.25              0.25
A431        Epidermoid    +++       0.3               0.35
LNCaP       Prostate      +         9                 8.5
HTB103      Gastric       +         3.5               3.5
HUT-102     Leukemia                >1000             >1000

*B3 antigen; estimated by immunofluorescence using MAb B3.





Cytotoxicity assays were performed by measuring
incorporation of 3H-leucine into cell protein as previously
described (Brinkmann et al., Proc. Natl. Acad. Sci. USA
88:8616-8620 (1991) (Brinkmann I), incorporated by reference
herein). IC50 is the concentration of immunotoxin that causes a
50% inhibition of protein synthesis following a 16 hour
incubation with immunotoxin.
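The IC50 defined above can be estimated from a dose-response series by interpolating, on a logarithmic concentration scale, between the two doses that bracket 50% inhibition. The following sketch shows one simple way to do this. The data points are invented for illustration, and the interpolation scheme is an assumption rather than the procedure used to obtain the values reported in the tables.

```python
import math

def ic50(concs_ng_ml, pct_of_control):
    """Estimate IC50 by log-linear interpolation.

    concs_ng_ml: increasing immunotoxin concentrations (ng/ml).
    pct_of_control: 3H-leucine incorporation as % of untreated control.
    Returns None if the series never crosses 50%.
    """
    points = list(zip(concs_ng_ml, pct_of_control))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo >= 50.0 >= y_hi and y_lo != y_hi:
            frac = (y_lo - 50.0) / (y_lo - y_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None

# Hypothetical dose-response data, for illustration only.
doses = [0.01, 0.1, 1.0, 10.0]
incorporation = [98.0, 80.0, 20.0, 5.0]
print(ic50(doses, incorporation))
```

A series that never reaches 50% inhibition within the tested range returns None, which corresponds to entries reported only as a lower bound such as >1000 ng/ml.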

A comparison of Fv-mediated specific cytotoxicity of
a single-chain immunotoxin, B3(Fv)-PE38KDEL, and the
corresponding disulfide-stabilized B3(dsFv)-PE38KDEL shows
that both proteins recognize the same spectrum of cells and
are equally active (Fig. 3, Tables 6 and 7). B3(dsFv)-
PE38KDEL, like B3(Fv)-PE38KDEL, is cytotoxic only to B3-antigen
expressing cells and has no effect on cells which do not
bind MAb B3 (e.g., HUT102). The addition of excess MAb B3,
but not an excess of HB21, an antibody to the human
transferrin receptor, can compete with this cytotoxicity,
confirming that the activity of B3(dsFv)-PE38KDEL is due to
specific binding to the B3-antigen (Fig. 3C). In this
competition experiment, excess MAb B3 or HB21 (to a final
concentration of 1 mg/ml) was added 15 min before addition of
toxin. A high concentration of MAb B3 is necessary for
competition because of the large amount of B3-antigen present
on carcinoma cells (Brinkmann I, supra; Brinkmann II, supra;
Pai et al., Proc. Natl. Acad. Sci. USA 88:3358-3362 (1991)).
The finding that the specificity and activity of scFv- and
dsFv-immunotoxins are indistinguishable indicates that the
binding region is conserved equally well in the disulfide-
stabilized B3(Fv) and in the linker-stabilized molecule.

WO 94/29350                                        PCT/US94/06687

TABLE 7

Placement of the disulfide bond connecting V H and V L
at different positions of B3(Fv)

                         PE38KDEL fusion protein (IC50 in ng/ml)
Cell Line   B3(Fv)   B3(dsFv)   B3(scdsFv)   B3(scdsFv)   B3(scdsFv)
                                H44-L105     H105-L55     H103-L56

A431        0.3      0.3        0.4          80           250
MCF7        0.25     0.25       0.3          90           200



IV. Stability of B3(Fv)- and B3(dsFv)-PE38KDEL in human
serum.

Because dsFv- and scFv-immunotoxins have equal
activity towards cultured carcinoma cells, B3(dsFv)-PE38KDEL
should also be useful for cancer treatment, like its scFv
counterpart, B3(Fv)-PE38KDEL (Brinkmann I, supra). One factor
that contributes to the therapeutic usefulness of immunotoxins
is their stability. The stability of Fv-immunotoxins was
determined by incubating them at a concentration of 10 µg/ml
at 37°C in human serum. Active immunotoxin remaining after
different lengths of incubation was determined by cytotoxicity
assays on A431 cells. Table 8 shows a comparison of the
stability of scFv- and dsFv-immunotoxins in human serum. The
scFv-toxin B3(Fv)-PE38KDEL is stable for one to two hours and
then begins to lose activity. In marked contrast, the dsFv-
toxin B3(dsFv)-PE38KDEL retains full cytotoxic activity for
more than 24 hours.

TABLE 8

Stability of B3(Fv)-PE38KDEL and
B3(dsFv)-PE38KDEL in human serum







                               % activity left
Hours                 0     0.5    1     2     4     8     12    24

Sample
scFv in serum 1       100   100    87    50    31    14    14    1
scFv in serum 2       100   88     58    35    20    6     4     1
dsFv in serum 1       100   100    100   100   100   100   100   100
dsFv in serum 2       100   100    100   100   100   100   100   100

Each type of immunotoxin was incubated at 10 µg/ml in
human serum at 37°C for the times shown and then assayed for
cytotoxic activity on A431 cells.
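To make the comparison in Table 8 concrete, the sketch below estimates, by linear interpolation between measured time points, the time at which each sample falls to 50% of its starting activity. The activity values are those listed in Table 8; the interpolation and the function name are illustrative assumptions and not part of the reported procedure.

```python
HOURS = [0, 0.5, 1, 2, 4, 8, 12, 24]

# Percent activity left, as listed in Table 8.
ACTIVITY = {
    "scFv in serum 1": [100, 100, 87, 50, 31, 14, 14, 1],
    "scFv in serum 2": [100, 88, 58, 35, 20, 6, 4, 1],
    "dsFv in serum 1": [100] * 8,
    "dsFv in serum 2": [100] * 8,
}

def time_to_threshold(hours, activity, threshold=50.0):
    """First time (h) at which activity falls to the threshold, by linear
    interpolation between measured points; None if it never does."""
    series = list(zip(hours, activity))
    for (t0, a0), (t1, a1) in zip(series, series[1:]):
        if a0 > threshold >= a1:
            return t0 + (a0 - threshold) / (a0 - a1) * (t1 - t0)
    return None

for name, values in ACTIVITY.items():
    print(name, time_to_threshold(HOURS, values))
```

For the two scFv samples the estimate falls between one and two hours, in line with the description above, while the dsFv samples never reach the threshold within the 24 hour series.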

V. Immunotoxin e23(dsFv)-PE38KDEL.

MAb e23 is an antibody directed against the erbB2
antigen, which is present on many human carcinomas. e23(Fv)-
PE40 is a single-chain immunotoxin composed of the single-
chain Fv of e23, in which V L is connected by a peptide linker
to V H, which in turn is fused to a truncated form of Pseudomonas
exotoxin (PE40). e23(Fv)-PE40 has been shown to be of
potential use in cancer therapy (Batra et al., Proc. Natl.
Acad. Sci. USA 89:5867-5871 (1992)). e23(Fv)-PE38KDEL is a
single-chain derivative of e23(Fv)-PE40 in which the toxin
part of the immunotoxin is PE38KDEL instead of PE40, which
results in improved activity.

A. Position of the disulfide. 

The Fv region of e23 can be stabilized by a
disulfide bond in the same manner as described for B3(Fv)
above. We made the immunotoxin e23(dsFv)-PE38KDEL, which
corresponds in its composition to e23(scFv)-PE38KDEL, except
that the peptide linker between V L and V H is omitted and
replaced by a disulfide bond. The positions that we used for
introduction of the disulfide correspond to position
V H 44-V L 100 according to Kabat and Wu, and to positions V H asn43
and V L gly99 in the actual e23 sequence (see Figure 4).

B. Plasmid constructions. 

The replacement of framework residues by cysteines,
deletion of the linker peptide and construction of plasmids
for separate expression of the components of the e23(dsFv)
immunotoxin were done by standard mutagenesis and cloning
techniques as described in the example above. Mutagenic
oligonucleotides that were used for replacement of V H asn43
and V L gly99 with cysteines were
5'-AGTCCAATCCACTCGAGGCACTTTCCATGGCTCTGC-3' (Seq. ID No. 12)
(V H) and
5'-TATTTCCAGCTTGGACCCACATCCGAACGTGGGTGrG-3' (Seq. ID No. 13)
(V L); a stop codon at the end of the V L was introduced by the
primer
5'-AGAAGATTTACCAGAACCAGGAATTCATTATTTTATTTCCAGCTTGGACC-3' (Seq.
ID No. 14). Details of the plasmid constructions are
described in Figure 5. Note that, in contrast to B3(Fv)-
immunotoxins, the toxin portion of the e23(Fv)-immunotoxins,
e23(scFv)- and e23(dsFv)-PE38KDEL, is fused to the V H and not to
the V L domain of the Fv.
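In the B3 example, restriction sites (XhoI and EcoRI) were introduced into the mutagenic oligonucleotides to help identify mutated clones. Whether the same was intended for the e23 oligonucleotides is not stated here, but a simple scan for the standard XhoI (CTCGAG) and EcoRI (GAATTC) recognition sequences can be sketched as follows. The sequences are Seq. ID Nos. 12 and 14 as transcribed in this section with spacing removed; the function name is hypothetical.

```python
# Recognition sequences of the two enzymes named in the B3 example.
SITES = {"XhoI": "CTCGAG", "EcoRI": "GAATTC"}

def find_sites(oligo, sites=SITES):
    """Map enzyme name -> list of 0-based start positions in the oligo."""
    found = {}
    for name, motif in sites.items():
        positions = []
        start = oligo.find(motif)
        while start != -1:
            positions.append(start)
            start = oligo.find(motif, start + 1)
        if positions:
            found[name] = positions
    return found

# Sequences as transcribed in the text (5' to 3').
seq_id_12 = "AGTCCAATCCACTCGAGGCACTTTCCATGGCTCTGC"
seq_id_14 = "AGAAGATTTACCAGAACCAGGAATTCATTATTTTATTTCCAGCTTGGACC"

print(find_sites(seq_id_12))
print(find_sites(seq_id_14))
```

Because both recognition sequences are palindromic, scanning one strand is sufficient to detect a site on either strand.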

C. Production of e23(dsFv)-PE38KDEL.

The components of e23(dsFv)-PE38KDEL, which are
e23(V L cys99) and e23(V H cys43)-PE38KDEL, were expressed
separately in E. coli in inclusion bodies, which were isolated
and refolded as described above. Active proteins were
isolated by ion exchange and size exclusion chromatography
essentially as described above. We found, however, that in
contrast to purification of B3(dsFv)-immunotoxins, the
preparation did not contain as much contaminating "single
domain" immunotoxin. This is because in the B3(dsFv)-
immunotoxin example the toxin is fused to V L, while in the
e23(dsFv)-immunotoxin the toxin is fused to V H. It has been
described that single domain V L -toxins are much more soluble
than V H -toxins, which strongly tend to aggregate. Because of
that, in the B3(dsFv) example, soluble V L -toxin molecules can
severely contaminate the dsFv-immunotoxin preparation, while
in the e23(dsFv) example the contaminating V H -toxins aggregate
and precipitate, and thus can be easily removed from the dsFv-
immunotoxin.

D. Comparison of scFv and dsFv of e23.

As described above, specific cytotoxicity of Fv-
immunotoxins can be used to assess the specific binding of the
Fv portion of the immunotoxin. The comparison of the specific
cytotoxicity of scFv- and dsFv-immunotoxins derived from MAb
e23 on cells that have erbB2 on their surface is listed in
Table 9 (see Table 6 and related discussions for protocol
details). The dsFv-immunotoxin of e23 is at least as active
as, and might even be slightly more active than, the scFv
counterpart. Thus, the specific binding of the dsFv of e23 to
erbB2 is the same as or superior to that of e23(scFv).

Table 9

Cell Line   Cancer     e23(scFv)PE38KDEL    e23(dsFv)PE38KDEL

N87         gastric    0.2 ng/ml            0.06 ng/ml
HTB20       breast     0.075 ng/ml          0.06 ng/ml

SEQUENCE LISTING

(1) GENERAL INFORMATION:

     (i) APPLICANT: PASTAN, Ira
                    LEE, Byungkook
                    JUNG, Sun-Hee
                    BRINKMANN, Ulrich

    (ii) TITLE OF INVENTION: Recombinant Disulfide-Stabilized
         Polypeptide Fragments Having Binding Specificity

   (iii) NUMBER OF SEQUENCES: 14

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Townsend and Townsend Khourie and Crew
          (B) STREET: Steuart Street Tower, One Market Plaza
          (C) CITY: San Francisco
          (D) STATE: California
          (E) COUNTRY: US
          (F) ZIP: 94105-1493

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER: US
          (B) FILING DATE: 14-JUN-1993
          (C) CLASSIFICATION:

  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Weber, Ellen L.
          (B) REGISTRATION NUMBER: 32,762
          (C) REFERENCE/DOCKET NUMBER: 15280-152

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: (415) 543-9600
          (B) TELEFAX: (415) 543-5043



(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 118 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (ix) FEATURE:
          (A) NAME/KEY: Region
          (B) LOCATION: 1..30
          (D) OTHER INFORMATION: /label= FR1
               /note= "Framework Region 1"

    (ix) FEATURE:
          (A) NAME/KEY: Region
          (B) LOCATION: 31..35
          (D) OTHER INFORMATION: /label= CDR1
               /note= "Complementarity Determining Region 1"

    (ix) FEATURE:
          (A) NAME/KEY: Region
          (B) LOCATION: 36..49
          (D) OTHER INFORMATION: /label= FR2
               /note= "Framework Region 2"

    (ix) FEATURE:
          (A) NAME/KEY: Region
          (B) LOCATION: 50..66
          (D) OTHER INFORMATION: /label= CDR2
               /note= "Complementarity Determining Region 2"

    (ix) FEATURE:
          (A) NAME/KEY: Modified-site
          (B) LOCATION: 44
          (D) OTHER INFORMATION: /note= "Residue that can be changed
               to Cys for possible interchain disulfide bond."

    (ix) FEATURE:
          (A) NAME/KEY: Region
          (B) LOCATION: 67..100
          (D) OTHER INFORMATION: /label= FR3
               /note= "Framework Region 3"

    (ix) FEATURE:
          (A) NAME/KEY: Region
          (B) LOCATION: 101..108
          (D) OTHER INFORMATION: /label= CDR3
               /note= "Complementarity Determining Region 3"

    (ix) FEATURE:
          (A) NAME/KEY: Region
          (B) LOCATION: 109..118
          (D) OTHER INFORMATION: /label= FR4
               /note= "Framework Region 4"

    (ix) FEATURE:
          (A) NAME/KEY: Modified-site
          (B) LOCATION: 111
          (D) OTHER INFORMATION: /note= "Residue that can be changed
               to Cys for possible interchain disulfide bond."

    (ix) FEATURE:
          (A) NAME/KEY: Modified-site
          (B) LOCATION: 95
          (D) OTHER INFORMATION: /note= "The Ser to Tyr mutation

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



     Thr Leu Val Thr Val Ser
             115



(2) INFORMATION FOR SEQ ID NO:2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 121 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (ix) FEATURE:
          (A) NAME/KEY: Region
          (B) LOCATION: 1..30
          (D) OTHER INFORMATION: /label= FR1
               /note= "Framework Region 1"

    (ix) FEATURE:
          (A) NAME/KEY: Region
          (B) LOCATION: 31..35
          (D) OTHER INFORMATION: /label= CDR1
               /note= "Complementarity Determining Region 1"

    (ix) FEATURE:
          (A) NAME/KEY: Region
          (B) LOCATION: 36..49
          (D) OTHER INFORMATION: /label= FR2
               /note= "Framework Region 2"

    (ix) FEATURE:
          (A) NAME/KEY: Modified-site
          (B) LOCATION: 44
          (D) OTHER INFORMATION: /note= "Residue that can be changed
               to Cys for possible interchain disulfide bond."

    (ix) FEATURE:
          (A) NAME/KEY: Region
          (B) LOCATION: 50..68
          (D) OTHER INFORMATION: /label= CDR2
               /note= "Complementarity Determining Region 2"

    (ix) FEATURE:
          (A) NAME/KEY: Region
          (B) LOCATION: 69..103
          (D) OTHER INFORMATION: /label= FR3
               /note= "Framework Region 3"

    (ix) FEATURE:
          (A) NAME/KEY: Region
          (B) LOCATION: 104..111
          (D) OTHER INFORMATION: /label= CDR3
               /note= "Complementarity Determining Region 3"

    (ix) FEATURE:
          (A) NAME/KEY: Region
          (B) LOCATION: 112..121
          (D) OTHER INFORMATION: /label= FR4
               /note= "Framework Region 4"

    (ix) FEATURE:
          (A) NAME/KEY: Modified-site
          (B) LOCATION: 114
          (D) OTHER INFORMATION: /note= "Residue that can be changed
               to Cys for possible interchain disulfide bond."

    (ix) FEATURE:
          (A) NAME/KEY: Modified-site
          (B) LOCATION: 97
          (D) OTHER INFORMATION: /note= "The Ser to Tyr mutation

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

     Glu Val Lys Leu Val Glu Ser Gly Gly Gly Leu Val Gln Pro Gly Gly
     1               5                   10                  15

     Ser Leu Arg Leu Ser Cys Ala Thr Ser Gly Phe Thr Phe Ser Asp Phe
                 20                  25                  30

     Tyr Met Glu Trp Val Arg Gln Pro Pro Gly Lys Arg Leu Glu Trp Ile
             35                  40                  45

     Ala Ala Ser Arg Asn Lys Gly Asn Lys Tyr Thr Thr Glu Tyr Ser Ala
         50                  55                  60

     Ser Val Lys Gly Arg Phe Ile Val Ser Arg Asp Thr Ser Gln Ser Ile
     65                  70                  75                  80

     Leu Tyr Leu Gln Met Asn Ala Leu Arg Ala Glu Asp Thr Ala Ile Tyr
                     85                  90                  95

     Tyr Cys Ala Arg Asn Tyr Tyr Gly Ser Thr Trp Tyr Phe Asp Val Trp
                 100                 105                 110

     Gly Ala Gly Thr Thr Val Thr Val Ser
             115                 120

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 112 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(ix) FEATURE: 

(A) NAME /KEY: Region 

(B) LOCATION: 1..23 

(D) OTHER INFORMATION: /label= FR1 
/note= "Framework Region 1" 

(ix) FEATURE: 

(A) NAME/KEY: Region 

(B) LOCATION: 24.. 3 9 

(D) OTHER INFORMATION: /label= CDR1 

/note= "Complementarity Determining Region l w 

(ix) FEATURE: 

(A) NAME /KEY : Region 

(B) LOCATION: 4 0.. 54 

(D) OTHER INFORMATION: /label= FR2 
/note- "Framwork Region 2" 

(ix) FEATURE: 

(A) NAME /KEY : Modif ied-site 

(B) LOCATION: 48 

(D) OTHER INFORMATION: /note= "Residue that can be changed 
to Cys for possible interchain disufide bond." 
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(ix) FEATURE: 

(A) NAME /KEY : Region 

(B) LOCATION: 55.. 61 

(D) OTHER INFORMATION: /label= CDR2 

/note= "Complementarity Determining Region 2" 

(ix) FEATURE: 

(A) NAME/KEY: Region 

(B) LOCATION: 62.. 93 

(D) OTHER INFORMATION: /label= FR3 
/note= "Framework Region 3" 

(ix) FEATURE: 

(A) NAME /KEY : Region 

(B) LOCATION: 94.. 102 

(D) OTHER INFORMATION: /label= CDR3 

/note= "Complementarity Determining Region 3" 

(ix) FEATURE: 

(A) NAME/KEY: Region 

(B) LOCATION: 103..112 

(D) OTHER INFORMATION: /label= FR4 
/note= "Framework Region 4" 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 105 

(D) OTHER INFORMATION: /note= "Residue that can be changed 
to Cys for possible interchain disulfide bond." 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 92 

(D) OTHER INFORMATION: /note= "The Ser to Tyr mutation 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 




Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Tyr Cys Phe Gln Gly 
85 90 95 

Ser His Val Pro Phe Thr Phe Gly Ser Gly Thr Lys Leu Glu Ile Lys 
100 105 110 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(ix) FEATURE: 

(A) NAME/KEY: Region 

(B) LOCATION: 1..23 

(D) OTHER INFORMATION: /label= FR1 
/note= "Framework Region 1" 

(ix) FEATURE: 

(A) NAME/KEY: Region 

(B) LOCATION: 24..40 

(D) OTHER INFORMATION: /label= CDR1 

/note= "Complementarity Determining Region" 

(ix) FEATURE: 

(A) NAME/KEY: Region 

(B) LOCATION: 41..55 

(D) OTHER INFORMATION: /label= FR2 
/note= "Framework Region 2" 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 49 

(D) OTHER INFORMATION: /note= "Residue that can be changed 
to Cys for possible interchain disulfide bond." 

(ix) FEATURE: 

(A) NAME/KEY: Region 

(B) LOCATION: 56..62 

(D) OTHER INFORMATION: /label= CDR2 

/note= "Complementarity Determining Region 2" 

(ix) FEATURE: 

(A) NAME/KEY: Region 

(B) LOCATION: 63..94 

(D) OTHER INFORMATION: /label= FR3 
/note= "Framework Region 3" 

(ix) FEATURE: 

(A) NAME/KEY: Region 

(B) LOCATION: 95..103 

(D) OTHER INFORMATION: /label= CDR3 

/note= "Complementarity Determining Region 3" 

(ix) FEATURE: 

(A) NAME/KEY: Region 

(B) LOCATION: 104..113 

(D) OTHER INFORMATION: /label= FR4 
/note= "Framework Region 4" 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 106 

(D) OTHER INFORMATION: /note= "Residue that can be changed 
to a Cys for possible interchain disulfide bond." 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 93 

(D) OTHER INFORMATION: /note= "The Ser to Tyr mutation 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Asp Ile Val Met Thr Gln Ser Pro Ser Ser Leu Ser Val Ser Ala Gly 
1 5 10 15 

Glu Arg Val Thr Met Ser Cys Lys Ser Ser Gln Ser Leu Leu Asn Ser 
20 25 30 

Gly Asn Gln Lys Asn Phe Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln 
35 40 45 

Pro Pro Lys Leu Leu Ile Tyr Gly Ala Ser Thr Arg Glu Ser Gly Val 
50 55 60 

Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr 
65 70 75 80 

Ile Ser Ser Val Gln Ala Glu Asp Leu Ala Val Tyr Tyr Cys Gln Asn 
85 90 95 

Asp His Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Glu Ile 
100 105 110 

Lys 
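
The feature table given for SEQ ID NO: 4 can be cross-checked against the residues themselves. The following is a minimal Python sketch offered as a reading aid, not as part of the listing as filed. The names `SEQ4`, `FEATURES` and `region` are hypothetical, and the residues are transcribed from the listing for SEQ ID NO: 4 using the standard three-letter codes.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Residues of SEQ ID NO: 4, transcribed from the listing (assumption:
# isoleucine and glutamine are written as Ile and Gln).
SEQ4 = """
Asp Ile Val Met Thr Gln Ser Pro Ser Ser Leu Ser Val Ser Ala Gly
Glu Arg Val Thr Met Ser Cys Lys Ser Ser Gln Ser Leu Leu Asn Ser
Gly Asn Gln Lys Asn Phe Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln
Pro Pro Lys Leu Leu Ile Tyr Gly Ala Ser Thr Arg Glu Ser Gly Val
Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr
Ile Ser Ser Val Gln Ala Glu Asp Leu Ala Val Tyr Tyr Cys Gln Asn
Asp His Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Glu Ile
Lys
""".split()

# Region boundaries from the feature table (1-based, inclusive).
FEATURES = {
    "FR1": (1, 23), "CDR1": (24, 40), "FR2": (41, 55), "CDR2": (56, 62),
    "FR3": (63, 94), "CDR3": (95, 103), "FR4": (104, 113),
}

def region(name):
    """Return the one-letter sequence of a named region."""
    start, end = FEATURES[name]
    return "".join(THREE_TO_ONE[res] for res in SEQ4[start - 1:end])

if __name__ == "__main__":
    print("length:", len(SEQ4))
    for name in FEATURES:
        print(name, region(name))
    # Sequential positions flagged as candidate cysteine sites.
    for pos in (49, 106):
        print(pos, SEQ4[pos - 1])
```

In this transcription the seven regions tile the 113 residues without gaps or overlaps, and the two positions flagged as modifiable sites (49 and 106) fall within FR2 and FR4 respectively.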

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
TATGCGACCC ACTCGAGACA CTTCTCTGGA GTCT 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
TTTCCAGCTT TGTCCCACAG CCGAACGTGA ATGG 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
CCGCCACCAC CGGATCCGCG AATTCATTAG GAGACAGTGA CCAGAGTC 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
TCGGTTGGAA ACTTTGCAGA TCAGGAGCTT TGGAGAC 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
TCGGTTGGAA ACGCAGTAGA TCAGAAGCTT TGGAGAC 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
AGTAAGCAAA CCAGGCGCAC CAGGCCAGTC CTCTTGCGCA GTAATATATG GC 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
AGTAAGCAAA ACAGGCTCCC CAGGCCAGTC CTCTTGCGCA GTAATATATG GC 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
AGTCCAATCC ACTCGAGGCA CTTTCCATGG CTCTGC 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
TATTTCCAGC TTGGACCCAC ATCCGAACGT GGGTGG 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
AGAAGATTTA CCAGAACCAG GAATTCATTA TTTTATTTCC AGCTTGGACC 
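
The declared lengths of the oligonucleotides of SEQ ID NOS: 5 to 14 can be checked against the sequences as listed. The following is a minimal Python sketch offered as a reading aid, not as part of the listing as filed. The names `OLIGOS` and `length_mismatches` are hypothetical, and the sequences are transcribed from the listing above in groups of ten bases.

```python
# SEQ ID NO -> (declared length in bases, sequence as listed).
OLIGOS = {
    5: (34, "TATGCGACCC ACTCGAGACA CTTCTCTGGA GTCT"),
    6: (34, "TTTCCAGCTT TGTCCCACAG CCGAACGTGA ATGG"),
    7: (48, "CCGCCACCAC CGGATCCGCG AATTCATTAG GAGACAGTGA CCAGAGTC"),
    8: (37, "TCGGTTGGAA ACTTTGCAGA TCAGGAGCTT TGGAGAC"),
    9: (37, "TCGGTTGGAA ACGCAGTAGA TCAGAAGCTT TGGAGAC"),
    10: (52, "AGTAAGCAAA CCAGGCGCAC CAGGCCAGTC CTCTTGCGCA GTAATATATG GC"),
    11: (52, "AGTAAGCAAA ACAGGCTCCC CAGGCCAGTC CTCTTGCGCA GTAATATATG GC"),
    12: (36, "AGTCCAATCC ACTCGAGGCA CTTTCCATGG CTCTGC"),
    13: (36, "TATTTCCAGC TTGGACCCAC ATCCGAACGT GGGTGG"),
    14: (50, "AGAAGATTTA CCAGAACCAG GAATTCATTA TTTTATTTCC AGCTTGGACC"),
}

def length_mismatches(oligos):
    """Return {seq_id: actual_length} for entries whose length differs
    from the declared value; raise if a non-ACGT character is found."""
    mismatches = {}
    for seq_id, (declared, text) in oligos.items():
        bases = text.replace(" ", "")  # drop the grouping spaces
        unexpected = set(bases) - set("ACGT")
        if unexpected:
            raise ValueError(f"SEQ ID NO: {seq_id} contains {sorted(unexpected)}")
        if len(bases) != declared:
            mismatches[seq_id] = len(bases)
    return mismatches

if __name__ == "__main__":
    print(length_mismatches(OLIGOS))
```

An empty result indicates that every transcribed sequence matches the length stated in its SEQUENCE CHARACTERISTICS entry.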

WHAT IS CLAIMED IS: 

1. A polypeptide specifically binding a ligand, 
the polypeptide comprising a first variable region of a ligand 
binding moiety bound through a disulfide bond to a second 
separate variable region of the ligand binding moiety, the 
bond connecting framework regions of the first and second 
variable regions. 

2. The polypeptide of claim 1, wherein the 
polypeptide does not substantially contain any constant region 
of an antibody. 

3. The polypeptide of claim 1, wherein the 
polypeptide is conjugated to a radioisotope, an enzyme, a 
toxin, a chelating agent or a drug. 

4. The polypeptide of claim 1, wherein the 
polypeptide is recombinantly fused to a toxin, enzyme or other 
pharmaceutical agent. 

5. The polypeptide of claim 1, wherein the first 
variable region contains a cysteine at position 98, 99, 100, 
or 101 and the second variable region contains a cysteine at 
position 43, 44, 45, 46 or 47, such positions being determined 
in accordance with the numbering scheme published by Kabat and 
Wu, corresponding to a light chain and a heavy chain region, 
respectively, of an antibody. 

6. The polypeptide of claim 5, wherein the first 
variable region contains a cysteine at position 100 and the 
second variable region contains a cysteine at position 44. 

7. The polypeptide of claim 1, wherein the first 
variable region contains a cysteine at position 42, 43, 44, 45 
or 46 and the second variable region contains a cysteine at 
position 103, 104, 105, or 106, such positions being 
determined in accordance with the numbering scheme published 
by Kabat and Wu, corresponding to a light chain and a heavy 
chain region, respectively, of an antibody. 

8. The polypeptide of claim 7, wherein the first 
variable region contains a cysteine at position 43 and the 
second variable region contains a cysteine at position 105. 

9. The polypeptide of claim 1, wherein the first 
variable region is a light chain variable region (V L ) of an 
antibody and the second variable region is a heavy chain 
variable region (V H ) of the antibody. 

10. The polypeptide of claim 1, wherein the first 
variable region is an α variable chain region of a T cell 
receptor and the second variable region is a β variable chain 
region of the T cell receptor. 

11. A method of producing a polypeptide 
specifically binding a ligand, the polypeptide comprising a 
first variable region of a ligand binding moiety connected 
through a disulfide bond to a second variable region of the 
ligand binding moiety in framework regions of the two variable 
regions, the method comprising the steps of: 

(a) mutating a nucleic acid for the first variable 
region so that cysteine is encoded at position 42, 43, 44, 45 
or 46, and mutating a nucleic acid sequence for the second 
variable region so that cysteine is encoded at position 103, 
104, 105, or 106, such positions being determined in 
accordance with the numbering scheme published by Kabat and 
Wu, corresponding to a light chain and a heavy chain region, 
respectively, of an antibody; or 

(b) mutating a nucleic acid for the first variable 
region so that cysteine is encoded at position 43, 44, 45, 46 
or 47 and mutating a nucleic acid for the second variable 
region so that cysteine is encoded at position 98, 99, 100, or 
101 such positions being determined in accordance with the 
numbering scheme published by Kabat and Wu, corresponding to a 
heavy chain or a light chain region respectively of an 
antibody; then 

(c) expressing the nucleic acid for the first 
variable region and the nucleic acid for the second variable 
region in an expression system; and 

(d) recovering the polypeptide having a binding 
affinity for the antigen. 

12. The method of claim 11, wherein the method 
further comprises purifying the polypeptide. 

13. A nucleic acid which codes for the polypeptide 
of claim 1. 

14. The nucleic acid of claim 13 which further 
includes a nucleic acid that codes for a toxin or 
pharmaceutical agent. 

15. The nucleic acid sequence of claim 14, wherein 
the toxin or pharmaceutical agent or toxin sequence is 
connected to the polypeptide by a peptide linker. 

16. The polypeptide of claim 1, wherein the first 
variable region and the second variable region are derived 
from the V L and V H , respectively, of MAb B3. 

17. The polypeptide of claim 16, wherein the 
arginine at position 44 of the V H region and the serine at 
position 100 of the V L region are replaced by cysteines, such 
positions being determined in accordance with the numbering 
scheme published by Kabat and Wu. 

18. The polypeptide of claim 16, wherein the 
glutamine at position 105 of the V H region and the serine at 
position 43 of the V L region are replaced by cysteines, such 
positions being determined in accordance with the numbering 
scheme published by Kabat and Wu. 

19. The polypeptide of claim 17, wherein the serine 
at position 95 of the V H region is replaced by tyrosine. 

20. The polypeptide of claim 16, wherein the serine 
at position 95 of the V H region is replaced by tyrosine. 

21. A nucleic acid that codes for a light chain 
variable region (V L ) of an antibody wherein the V L contains a 
cysteine at position 42, 43, 44, 45, 46, 98, 99, 100, or 101, 
such positions being determined in accordance with the 
numbering scheme published by Kabat and Wu. 

22. A nucleic acid of claim 21, which encodes a 
cysteine at position 100 of the V L . 

23. A nucleic acid of claim 21, which encodes a 
cysteine at position 43 of the V L . 

24. A nucleic acid that codes for a heavy chain 
variable region (V H ) of an antibody wherein the V H contains a 
cysteine at position 43, 44, 45, 46, 47, 103, 104, 105 or 106, 
such positions being determined in accordance with the 
numbering scheme published by Kabat and Wu. 

25. A nucleic acid of claim 24, which encodes a 
cysteine at position 44 of the V H . 

26. A nucleic acid of claim 22, which encodes a 
cysteine at position 105 of the V H . 

27. A pharmaceutical composition for inhibiting the 
growth of tumor cells comprising a polypeptide specifically 
binding tumor cells with a light chain variable region (V L ) 
derived from MAb B3 which contains a cysteine at position 42, 
43, 44, 45, 46, 98, 99 or 100 and a heavy chain variable 
region (V H ) derived from MAb B3 which contains a cysteine at 
position 43, 44, 45, 46, 47, 103, 104, 105 or 106, wherein V L 
and V H are connected together through a disulfide bond and 
further wherein the polypeptide also comprises a toxin. 

28. The polypeptide of claim 9, wherein the V L and 
the V H are further connected by a peptide linker. 

29. The polypeptide of claim 1, wherein the first 
variable region and the second variable region are derived 
from V L and V H , respectively, of MAb e23. 

30. The polypeptide of claim 10, wherein the α 
variable chain region contains a cysteine at position 41, 42, 
43, 44, 45, 106, 107, 108 or 109 and the β variable chain 
region contains a cysteine at position 108, 109, 110, 111, 41, 
42, 43, 44 or 45, such positions being determined in 
accordance with the numbering scheme published by Kabat and Wu 
corresponding to α and β variable chain regions of T cell 
receptors. 